QORA: Neural-Enhanced Interference-Aware Resource Provisioning for Serverless Computing

Ruifeng Ma, Yufeng Zhan*, Chuge Wu, Zicong Hong, Yasir Ali, Yuanqing Xia*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Serverless is an emerging cloud paradigm that offers fine-grained resource sharing through serverless functions. However, this resource sharing can cause interference, leading to performance degradation and QoS violations. Existing white-box approaches to serverless resource provisioning often demand extensive expert knowledge, which is hard to obtain because the sources of interference are complex. This paper proposes QORA, a neural-enhanced interference-aware resource provisioning system for serverless computing. We model the resource provisioning of serverless functions as a novel combinatorial optimization problem in which the constraints on queries per second are derived from a neural network performance model. By leveraging neural networks to model the nonlinear performance fluctuations under various interference sources, our approach better captures the real-world behavior of serverless functions. To solve the formulated problem efficiently, rather than adopting commercial optimization solvers such as Gurobi, we propose a two-stage-VNS algorithm that searches discrete variables more efficiently and supports Sigmoid activations without introducing redundant discrete variables. Unlike pure machine learning methods, which lack theoretical optimality guarantees, our approach is rigorously proven to be globally optimal based on optimization theory. We implement QORA on Kubernetes as a serverless system that automates resource provisioning. Experimental results demonstrate that QORA reduces the QoS violation rate by 98% while cutting resource costs by up to 35% compared with state-of-the-art approaches.

Note to Practitioners: From the perspective of cloud service providers, this paper considers automatic resource provisioning for serverless functions. To improve hardware utilization, cloud providers tend to co-locate serverless functions on the same server. However, co-located functions compete for shared resources (memory bandwidth, L3 cache, etc.), which causes interference and leads to performance degradation and QoS violations. We use neural networks to build performance models of interference-prone serverless functions and formulate the resource allocation optimization problem with these neural network performance models as constraints. Compared with white-box modeling methods, our neural network modeling adapts to complex and variable interference. Compared with deep reinforcement learning methods, our combinatorial optimization method offers stronger interpretability. To solve this optimization problem efficiently, we design the two-stage-VNS solution algorithm. We implement QORA on Kubernetes as a serverless system that automatically allocates computing resources. Experiments on small-scale real clusters and large-scale simulations demonstrate the effectiveness of QORA.
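The idea of minimizing resource cost subject to a learned QPS constraint can be illustrated with a toy sketch. The code below is not QORA's two-stage-VNS algorithm, whose details the abstract does not give. It is a generic variable neighborhood search over integer CPU allocations, and `predicted_qps` is a hypothetical single-Sigmoid stand-in for a trained performance model, with made-up weights. All function names and parameters here are assumptions chosen for illustration.

```python
import math

def predicted_qps(cpu, interference):
    """Hypothetical stand-in for a learned performance model (made-up weights)."""
    # Sigmoid saturation: more CPU units raise throughput, interference lowers it.
    return 100.0 / (1.0 + math.exp(-(0.8 * cpu - 2.0 * interference - 2.0)))

def cost(alloc):
    """Resource cost: total CPU units allocated across all functions."""
    return sum(alloc)

def feasible(alloc, interference, targets):
    """Every function must meet its QPS target under the model."""
    return all(predicted_qps(c, i) >= t
               for c, i, t in zip(alloc, interference, targets))

def neighbors(alloc, k, max_cpu):
    """Allocations reached by changing one function's CPU units by +/- k."""
    for idx in range(len(alloc)):
        for delta in (-k, k):
            c = alloc[idx] + delta
            if 1 <= c <= max_cpu:
                yield alloc[:idx] + [c] + alloc[idx + 1:]

def vns(interference, targets, max_cpu=16, k_max=3):
    """Generic variable neighborhood search (assumed toy variant, not QORA's)."""
    best = [max_cpu] * len(targets)  # start from the most generous allocation
    if not feasible(best, interference, targets):
        raise ValueError("targets unreachable even at max_cpu")
    k = 1
    while k <= k_max:
        improved = False
        for cand in neighbors(best, k, max_cpu):
            if feasible(cand, interference, targets) and cost(cand) < cost(best):
                best, improved = cand, True
                break
        # Return to the smallest neighborhood after an improvement,
        # otherwise widen the neighborhood.
        k = 1 if improved else k + 1
    return best

if __name__ == "__main__":
    # Two functions; the second one suffers interference and has a higher target.
    print(vns(interference=[0.0, 1.0], targets=[50.0, 80.0]))
```

In this toy model, the function under interference needs more CPU units to meet the same target than it would in isolation, which is the effect an interference-aware provisioner has to account for.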

Cite this

Ma, R., Zhan, Y., Wu, C., Hong, Z., Ali, Y., & Xia, Y. (Accepted/In press). QORA: Neural-Enhanced Interference-Aware Resource Provisioning for Serverless Computing. IEEE Transactions on Automation Science and Engineering. https://doi.org/10.1109/TASE.2025.3526197