TY - JOUR
T1 - Stochastic Frank-Wolfe Algorithm for Constrained Bilevel Optimization With Improved Per-Iteration Complexity
AU - Hou, Jie
AU - Zeng, Xianlin
AU - Cui, Shisheng
AU - Jiang, Xia
AU - Sun, Jian
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Bilevel optimization, where one optimization problem is inherently nested within another, has gained significant attention due to its extensive applications in machine learning, such as hyperparameter optimization and meta-learning. Most existing algorithms are designed to address unconstrained bilevel optimization problems, with few are capable of effectively tackling complex constrained settings. To address this gap, we propose a novel, fully single-loop stochastic Frank-Wolfe algorithm. This algorithm incorporates a Hessian-inverse-vector approximation technique, momentum-based gradient tracking, and a Frank-Wolfe update. Our proposed algorithm improves per-iteration complexity and achieves lower sample complexity compared to existing Frank-Wolfe algorithms for bilevel optimization. We also conduct numerical simulations to demonstrate the efficacy of our algorithm compared to state-of-the-art methods.
AB - Bilevel optimization, where one optimization problem is inherently nested within another, has gained significant attention due to its extensive applications in machine learning, such as hyperparameter optimization and meta-learning. Most existing algorithms are designed to address unconstrained bilevel optimization problems, with few are capable of effectively tackling complex constrained settings. To address this gap, we propose a novel, fully single-loop stochastic Frank-Wolfe algorithm. This algorithm incorporates a Hessian-inverse-vector approximation technique, momentum-based gradient tracking, and a Frank-Wolfe update. Our proposed algorithm improves per-iteration complexity and achieves lower sample complexity compared to existing Frank-Wolfe algorithms for bilevel optimization. We also conduct numerical simulations to demonstrate the efficacy of our algorithm compared to state-of-the-art methods.
KW - bilevel optimization
KW - hypergradient approximation
KW - Stochastic Frank-Wolfe algorithm
KW - variance reduction technique
UR - https://www.scopus.com/pages/publications/105012937433
U2 - 10.1109/TSP.2025.3596231
DO - 10.1109/TSP.2025.3596231
M3 - Article
AN - SCOPUS:105012937433
SN - 1053-587X
VL - 73
SP - 3237
EP - 3252
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
ER -