TY - JOUR
T1 - QORA
T2 - Neural-Enhanced Interference-Aware Resource Provisioning for Serverless Computing
AU - Ma, Ruifeng
AU - Zhan, Yufeng
AU - Wu, Chuge
AU - Hong, Zicong
AU - Ali, Yasir
AU - Xia, Yuanqing
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Serverless is an emerging cloud paradigm that offers fine-grained resource sharing through serverless functions. However, this resource sharing can cause interference, leading to performance degradation and QoS violations. Existing white-box approaches to serverless resource provisioning often demand extensive expert knowledge, which is challenging to obtain due to the complexity of interference sources. This paper proposes QORA, a neural-enhanced interference-aware resource provisioning system for serverless computing. We model the resource provisioning of serverless functions as a novel combinatorial optimization problem, wherein the constraints on the queries per second are derived from a neural network performance model. By leveraging neural networks to model the nonlinear performance fluctuations under various interference sources, our approach better captures the real-world behavior of serverless functions. To solve the formulated problem efficiently, rather than adopting commercial optimization solvers such as Gurobi, we propose a two-stage-VNS algorithm that searches discrete variables more efficiently and supports Sigmoid activations, avoiding the introduction of redundant discrete variables. Unlike pure machine learning methods that lack theoretical optimality guarantees, our approach is rigorously proven to be globally optimal based on optimization theory. We implement QORA on Kubernetes as a serverless system automating resource provisioning. Experimental results demonstrate that QORA reduces the QoS violation rate by 98% while reducing resource costs by up to 35% compared with state-of-the-art approaches. Note to Practitioners - From the perspective of cloud service providers, this paper considers automatic resource provisioning for serverless functions. To improve hardware utilization, cloud providers tend to co-locate serverless functions on the same server. However, co-located functions compete for shared resources (memory bandwidth, L3 cache, etc.), which causes interference and leads to performance degradation and QoS violations. We use neural networks to build the performance models of interference-prone serverless functions and formulate the resource allocation optimization problem with neural network performance models as constraints. Compared to white-box modeling methods, our neural network modeling adapts to complex and variable interference. Compared to deep reinforcement learning methods, our combinatorial optimization method is more interpretable. In order to solve this optimization problem efficiently, we design the two-stage-VNS solution algorithm. We implement QORA on Kubernetes as a serverless system, which can automatically allocate computing resources. Experiments on small-scale real clusters and large-scale simulations demonstrate the effectiveness of QORA.
AB - Serverless is an emerging cloud paradigm that offers fine-grained resource sharing through serverless functions. However, this resource sharing can cause interference, leading to performance degradation and QoS violations. Existing white-box approaches to serverless resource provisioning often demand extensive expert knowledge, which is challenging to obtain due to the complexity of interference sources. This paper proposes QORA, a neural-enhanced interference-aware resource provisioning system for serverless computing. We model the resource provisioning of serverless functions as a novel combinatorial optimization problem, wherein the constraints on the queries per second are derived from a neural network performance model. By leveraging neural networks to model the nonlinear performance fluctuations under various interference sources, our approach better captures the real-world behavior of serverless functions. To solve the formulated problem efficiently, rather than adopting commercial optimization solvers such as Gurobi, we propose a two-stage-VNS algorithm that searches discrete variables more efficiently and supports Sigmoid activations, avoiding the introduction of redundant discrete variables. Unlike pure machine learning methods that lack theoretical optimality guarantees, our approach is rigorously proven to be globally optimal based on optimization theory. We implement QORA on Kubernetes as a serverless system automating resource provisioning. Experimental results demonstrate that QORA reduces the QoS violation rate by 98% while reducing resource costs by up to 35% compared with state-of-the-art approaches. Note to Practitioners - From the perspective of cloud service providers, this paper considers automatic resource provisioning for serverless functions. To improve hardware utilization, cloud providers tend to co-locate serverless functions on the same server. However, co-located functions compete for shared resources (memory bandwidth, L3 cache, etc.), which causes interference and leads to performance degradation and QoS violations. We use neural networks to build the performance models of interference-prone serverless functions and formulate the resource allocation optimization problem with neural network performance models as constraints. Compared to white-box modeling methods, our neural network modeling adapts to complex and variable interference. Compared to deep reinforcement learning methods, our combinatorial optimization method is more interpretable. In order to solve this optimization problem efficiently, we design the two-stage-VNS solution algorithm. We implement QORA on Kubernetes as a serverless system, which can automatically allocate computing resources. Experiments on small-scale real clusters and large-scale simulations demonstrate the effectiveness of QORA.
KW - performance interference
KW - resource provisioning
KW - serverless computing
UR - http://www.scopus.com/inward/record.url?scp=85215376338&partnerID=8YFLogxK
U2 - 10.1109/TASE.2025.3526197
DO - 10.1109/TASE.2025.3526197
M3 - Article
AN - SCOPUS:85215376338
SN - 1545-5955
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -