TY - GEN
T1 - Online Sequential Decision-Making with Unknown Delays
AU - Wu, Ping
AU - Huang, Heyan
AU - Liu, Zhengyang
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
AB - In the field of online sequential decision-making, we address the problem with delays using the framework of online convex optimization (OCO), where the feedback of a decision may arrive with an unknown delay. Unlike previous research, which is limited to the Euclidean norm and gradient information, we propose three families of delayed algorithms based on approximate solutions to handle different types of received feedback. Our proposed algorithms are versatile and applicable to general norms. Specifically, we introduce a family of Follow the Delayed Regularized Leader algorithms for feedback with full information on the loss function, a family of Delayed Mirror Descent algorithms for feedback with gradient information on the loss function, and a family of Simplified Delayed Mirror Descent algorithms for feedback with the values of the loss function's gradients at the corresponding decision points. For each type of algorithm, we provide regret bounds for the cases of general convexity and relative strong convexity, respectively. We also demonstrate the efficiency of each algorithm under different norms through concrete examples. Furthermore, our theoretical results are consistent with the current best bounds when reduced to standard settings.
KW - approximate solution
KW - online convex optimization
KW - sequential decision-making
KW - unknown delays
UR - http://www.scopus.com/inward/record.url?scp=85194061571&partnerID=8YFLogxK
U2 - 10.1145/3589334.3645388
DO - 10.1145/3589334.3645388
M3 - Conference contribution
AN - SCOPUS:85194061571
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 4028
EP - 4036
BT - WWW 2024 - Proceedings of the ACM Web Conference
PB - Association for Computing Machinery, Inc
T2 - 33rd ACM Web Conference, WWW 2024
Y2 - 13 May 2024 through 17 May 2024
ER -