TY - JOUR
T1 - On the constrained online convex optimization with feedback delay
AU - Huang, Heyan
AU - Wu, Ping
AU - Lu, Haolin
AU - Liu, Zhengyang
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2025/8/25
Y1 - 2025/8/25
N2 - We investigate the problem of online convex optimization (OCO) under feedback delay, where feedback for a decision is received after a delay, and long-term constraints, where constraints can be violated in intermediate iterations but must be satisfied over the long run. Existing approaches are primarily limited to fixed delay settings and general convex loss functions. In this paper, we employ a stricter metric based on cumulative constraint violations. We first propose a novel algorithm tailored for the fixed d-slot delay setting, achieving a regret bound of O(dT) and a cumulative constraint violation of O (T[Formula presented]), demonstrating superior performance compared to existing results. Moreover, when the loss functions are strongly convex, the regret and violation bounds can be further reduced to O (dlnT) and O (dlnT), respectively. Additionally, we extend our algorithm to the more realistic re-indexed delay setting, achieving O(dT) regret and O(T[Formula presented]) cumulative constraint violation. Under strong convexity, these bounds are further improved to O(dˆlnT) and O(dˆlnT), where dˆ=maxt∈[T]dt denotes the maximum delay.
AB - We investigate the problem of online convex optimization (OCO) under feedback delay, where feedback for a decision is received after a delay, and long-term constraints, where constraints can be violated in intermediate iterations but must be satisfied over the long run. Existing approaches are primarily limited to fixed delay settings and general convex loss functions. In this paper, we employ a stricter metric based on cumulative constraint violations. We first propose a novel algorithm tailored for the fixed d-slot delay setting, achieving a regret bound of O(dT) and a cumulative constraint violation of O (T[Formula presented]), demonstrating superior performance compared to existing results. Moreover, when the loss functions are strongly convex, the regret and violation bounds can be further reduced to O (dlnT) and O (dlnT), respectively. Additionally, we extend our algorithm to the more realistic re-indexed delay setting, achieving O(dT) regret and O(T[Formula presented]) cumulative constraint violation. Under strong convexity, these bounds are further improved to O(dˆlnT) and O(dˆlnT), where dˆ=maxt∈[T]dt denotes the maximum delay.
KW - Constraint violation
KW - Feedback delay
KW - Long-term constraint
KW - Online convex optimization
KW - Regret bound
UR - http://www.scopus.com/inward/record.url?scp=105005275873&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2025.127871
DO - 10.1016/j.eswa.2025.127871
M3 - Article
AN - SCOPUS:105005275873
SN - 0957-4174
VL - 287
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 127871
ER -