TY - JOUR
T1 - CLAP
T2 - Component-Level Approximate Processing for Low Tail Latency and High Result Accuracy in Cloud Online Services
AU - Han, Rui
AU - Huang, Siguang
AU - Wang, Zhentao
AU - Zhan, Jianfeng
N1 - Publisher Copyright:
© 1990-2012 IEEE.
PY - 2017/8/1
Y1 - 2017/8/1
N2 - Modern latency-critical online services such as search engines often process requests by consulting large input data spanning massive parallel components. Hence the tail latency of these components determines the service latency. To trade off result accuracy for tail latency reduction, existing techniques use the components responding before a specified deadline to produce approximate results. However, they skip a large proportion of components when load gets heavier, thus incurring large accuracy losses. In this paper, we propose CLAP to enable component-level approximate processing of requests for low tail latency and small accuracy losses. CLAP aggregates information of input data to create small aggregated data points. Using these points, CLAP reduces latency variance of parallel components and allows them to produce initial results quickly; CLAP also identifies the parts of input data most related to requests' result accuracies, thus first using these parts to improve the produced results to minimize accuracy losses. We evaluated CLAP using real services and datasets. The results show: (i) CLAP reduces tail latency by 6.46 times with accuracy losses of 2.2 percent compared to existing exact processing techniques; (ii) when using the same latency, CLAP reduces accuracy losses by 31.58 times compared to existing approximate processing techniques.
AB - Modern latency-critical online services such as search engines often process requests by consulting large input data spanning massive parallel components. Hence the tail latency of these components determines the service latency. To trade off result accuracy for tail latency reduction, existing techniques use the components responding before a specified deadline to produce approximate results. However, they skip a large proportion of components when load gets heavier, thus incurring large accuracy losses. In this paper, we propose CLAP to enable component-level approximate processing of requests for low tail latency and small accuracy losses. CLAP aggregates information of input data to create small aggregated data points. Using these points, CLAP reduces latency variance of parallel components and allows them to produce initial results quickly; CLAP also identifies the parts of input data most related to requests' result accuracies, thus first using these parts to improve the produced results to minimize accuracy losses. We evaluated CLAP using real services and datasets. The results show: (i) CLAP reduces tail latency by 6.46 times with accuracy losses of 2.2 percent compared to existing exact processing techniques; (ii) when using the same latency, CLAP reduces accuracy losses by 31.58 times compared to existing approximate processing techniques.
KW - Cloud online services
KW - aggregated data points
KW - component-level approximate processing
KW - result accuracy
KW - tail latency
UR - http://www.scopus.com/inward/record.url?scp=85028071752&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2017.2650988
DO - 10.1109/TPDS.2017.2650988
M3 - Article
AN - SCOPUS:85028071752
SN - 1045-9219
VL - 28
SP - 2190
EP - 2203
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 8
M1 - 7812758
ER -