TY - JOUR
T1 - Minimizing Tardiness for Data-Intensive Applications in Heterogeneous Systems
T2 - A Matching Theory Perspective
AU - Xu, Ke
AU - Lv, Liang
AU - Li, Tong
AU - Shen, Meng
AU - Wang, Haiyang
AU - Yang, Kun
N1 - Publisher Copyright: © 1990-2012 IEEE.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - The increasing data requirements of Internet applications have driven a dramatic surge in the development of new programming paradigms and complex scheduling algorithms to handle data-intensive workloads. Due to the expanding volume and variety of such flows, their raw data are often processed on Intermediate Processing Nodes (IPNs) before being sent to servers. However, the intermediate processing constraint is rarely considered in existing flow computing models. This paper aims to minimize the tardiness of data-intensive applications in the presence of the intermediate processing constraint. Motivating cases show that the tardiness is affected by both IPN locations and flow dispatching strategies. Based on the observation that dispatching flows to IPNs is essentially building a matching between flows and IPNs, a novel solution is proposed based on matching theory. In the deployment phase, a tardiness-aware deferred acceptance algorithm is developed to optimize IPN locations. In the operation phase, the Power-of-D paradigm and matching theory are combined to dispatch flows efficiently. Evaluation results show that our solution effectively minimizes the total tardiness of data-intensive applications in heterogeneous systems.
KW - Heterogeneous system
KW - data-intensive application
KW - matching theory
KW - power-of-D
UR - http://www.scopus.com/inward/record.url?scp=85070685168&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2019.2930992
DO - 10.1109/TPDS.2019.2930992
M3 - Article
AN - SCOPUS:85070685168
SN - 1045-9219
VL - 31
SP - 144
EP - 158
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 1
M1 - 8772187
ER -