TY - GEN
T1 - Interference-Aware Component Scheduling for Reducing Tail Latency in Cloud Interactive Services
AU - Han, Rui
AU - Wang, Junwei
AU - Huang, Siguang
AU - Shao, Chenrong
AU - Zhan, Shulin
AU - Zhan, Jianfeng
AU - Vazquez-Poletti, Jose Luis
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/7/22
Y1 - 2015/7/22
N2 - Large-scale interactive services usually divide requests into multiple sub-requests and distribute them to a large number of server components for parallel execution. Hence the tail latency (i.e., the slowest component's latency) of these components determines the overall service latency. On a cloud platform, each component shares and competes for node resources such as caches and I/O bandwidth with its co-located jobs, and hence inevitably suffers from their performance interference. In this paper, we study the short-running jobs in a 12k-node Google cluster to illustrate their dynamic resource demands, which cause individual components' latencies to vary both over time and across different nodes, posing a major challenge to maintaining low tail latency. Motivated by this observation, this paper introduces a dynamic, interference-aware scheduler for large-scale, parallel cloud services. At each scheduling interval, the scheduler collects workload and resource contention information from the running service, and predicts both the latency of each component on different nodes and the overall service performance. Based on the predicted performance, it identifies straggling components and conducts near-optimal component-node allocations to adapt to changing workloads and performance interference. We demonstrate that, on realistic workloads, the proposed approach achieves significant reductions in tail latency compared to a baseline approach without scheduling.
AB - Large-scale interactive services usually divide requests into multiple sub-requests and distribute them to a large number of server components for parallel execution. Hence the tail latency (i.e., the slowest component's latency) of these components determines the overall service latency. On a cloud platform, each component shares and competes for node resources such as caches and I/O bandwidth with its co-located jobs, and hence inevitably suffers from their performance interference. In this paper, we study the short-running jobs in a 12k-node Google cluster to illustrate their dynamic resource demands, which cause individual components' latencies to vary both over time and across different nodes, posing a major challenge to maintaining low tail latency. Motivated by this observation, this paper introduces a dynamic, interference-aware scheduler for large-scale, parallel cloud services. At each scheduling interval, the scheduler collects workload and resource contention information from the running service, and predicts both the latency of each component on different nodes and the overall service performance. Based on the predicted performance, it identifies straggling components and conducts near-optimal component-node allocations to adapt to changing workloads and performance interference. We demonstrate that, on realistic workloads, the proposed approach achieves significant reductions in tail latency compared to a baseline approach without scheduling.
KW - Cloud interactive services
KW - Interference-aware scheduler
KW - component latency variability
KW - tail latency
UR - http://www.scopus.com/inward/record.url?scp=84944321452&partnerID=8YFLogxK
U2 - 10.1109/ICDCS.2015.88
DO - 10.1109/ICDCS.2015.88
M3 - Conference contribution
AN - SCOPUS:84944321452
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 744
EP - 745
BT - Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015
Y2 - 29 June 2015 through 2 July 2015
ER -