TY - GEN
T1 - A Novel Data Placement Strategy for Data-Sharing Scientific Workflows in Heterogeneous Edge-Cloud Computing Environments
AU - Du, Xin
AU - Tang, Songtao
AU - Lu, Zhihui
AU - Wet, Jie
AU - Gai, Keke
AU - Hung, Patrick C.K.
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10
Y1 - 2020/10
AB - The deployment of datasets in the heterogeneous edge-cloud computing paradigm has received increasing attention in state-of-the-art research. However, because the datasets are large and some scientific datasets are private, finding an optimal data placement strategy that minimizes data transmission while improving performance remains a persistent problem. In this study, the advantages of both edge and cloud computing are combined to construct a data placement model that works for multiple scientific workflows. The most difficult research challenge is to provide a data placement strategy that accounts for shared datasets, both within individual workflows and among multiple workflows, across geographically distributed environments. According to the constructed model, both the storage capacity of edge micro-datacenters and the data transfer between multiple clouds across regions must be considered. To address this issue, we consider the characteristics of this model and identify the factors that cause transmission delay. We propose a discrete particle swarm optimization algorithm with differential evolution (DE-DPSO) to distribute datasets during workflow execution. Based on this, a new data placement strategy named DE-DPSO-DPS is proposed. DE-DPSO-DPS is evaluated in several experiments in simulated heterogeneous edge-cloud computing environments. The results demonstrate that our data placement strategy effectively reduces data transmission time and achieves superior performance compared to traditional strategies for data-sharing scientific workflows.
KW - heterogeneous edge-cloud computing environments
KW - data placement
KW - data-sharing
KW - scientific workflows
UR - http://www.scopus.com/inward/record.url?scp=85099307692&partnerID=8YFLogxK
U2 - 10.1109/ICWS49710.2020.00073
DO - 10.1109/ICWS49710.2020.00073
M3 - Conference contribution
AN - SCOPUS:85099307692
T3 - Proceedings - 2020 IEEE 13th International Conference on Web Services, ICWS 2020
SP - 498
EP - 507
BT - Proceedings - 2020 IEEE 13th International Conference on Web Services, ICWS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th IEEE International Conference on Web Services, ICWS 2020
Y2 - 18 October 2020 through 24 October 2020
ER -