TY - GEN
T1 - PCS
T2 - 44th International Conference on Parallel Processing, ICPP 2015
AU - Han, Rui
AU - Wang, Junwei
AU - Huang, Siguang
AU - Shao, Chenrong
AU - Zhan, Shulin
AU - Zhan, Jianfeng
AU - Vazquez-Poletti, Jose Luis
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/12/8
Y1 - 2015/12/8
N2 - Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g. The 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted on a cloud environment, the components of a service typically co-locate with short batch jobs to increase machine utilizations, and share and contend resources such as caches and I/O bandwidths with them. The highly dynamic nature of batch jobs in terms of their workload types and input sizes causes continuously changing performance interference to individual components, hence leading to their latency variability and high tail latency. However, existing techniques either ignore such fine-grained component latency variability when managing service performance, or rely on executing redundant requests to reduce the tail latency, which adversely deteriorate the service performance when load gets heavier. In this paper, we propose PCS, a predictive and component-level scheduling framework to reduce tail latency for large-scale, parallel online services. It uses an analytical performance model to simultaneously predict the component latency and the overall service performance on different nodes. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing performance interferences from batch jobs. We demonstrate that, using realistic workloads, the proposed scheduler reduces the component tail latency by an average of 67.05% and the average overall service latency by 64.16% compared with the state-of-the-art techniques on reducing tail latency.
AB - Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g. The 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted on a cloud environment, the components of a service typically co-locate with short batch jobs to increase machine utilizations, and share and contend resources such as caches and I/O bandwidths with them. The highly dynamic nature of batch jobs in terms of their workload types and input sizes causes continuously changing performance interference to individual components, hence leading to their latency variability and high tail latency. However, existing techniques either ignore such fine-grained component latency variability when managing service performance, or rely on executing redundant requests to reduce the tail latency, which adversely deteriorate the service performance when load gets heavier. In this paper, we propose PCS, a predictive and component-level scheduling framework to reduce tail latency for large-scale, parallel online services. It uses an analytical performance model to simultaneously predict the component latency and the overall service performance on different nodes. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing performance interferences from batch jobs. We demonstrate that, using realistic workloads, the proposed scheduler reduces the component tail latency by an average of 67.05% and the average overall service latency by 64.16% compared with the state-of-the-art techniques on reducing tail latency.
KW - Cloud online services
KW - Component latency variability
KW - Predictive scheduler
KW - Tail latency
UR - http://www.scopus.com/inward/record.url?scp=84976463179&partnerID=8YFLogxK
U2 - 10.1109/ICPP.2015.58
DO - 10.1109/ICPP.2015.58
M3 - Conference contribution
AN - SCOPUS:84976463179
T3 - Proceedings of the International Conference on Parallel Processing
SP - 490
EP - 499
BT - Proceedings - 2015 44th International Annual Conference on Parallel Processing, ICPP 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 September 2015 through 4 September 2015
ER -