TY - JOUR
T1 - Cost-optimized microblog distribution over geo-distributed data centers
T2 - Insights from cross-media analysis
AU - Hu, Han
AU - Wen, Yonggang
AU - Chua, Tat Seng
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/4
Y1 - 2017/4
N2 - The unprecedent growth of microblog services poses significant challenges on network traffic and service latency to the underlay infrastructure (i.e., geo-distributed data centers). Furthermore, the dynamic evolution in microblog status generates a huge workload on data consistence maintenance. In this article, motivated by insights of cross-media analysis-based propagation patterns, we propose a novel cache strategy for microblog service systems to reduce the inter-data center traffic and consistence maintenance cost, while achieving low service latency. Specifically, we first present a microblog classification method, which utilizes the external knowledge from correlated domains, to categorize microblogs. Then we conduct a large-scale measurement on a representative online social network system to study the category-based propagation diversity on region and time scales. These insights illustrate social common habits on creating and consuming microblogs and further motivate our architecture design. Finally, we formulate the content cache problem as a constrained optimization problem. By jointly using the Lyapunov optimization framework and simplex gradient method, we find the optimal online control strategy. Extensive trace-driven experiments further demonstrate that our algorithm reduces the system cost by 24.5% against traditional approaches with the same service latency.
AB - The unprecedent growth of microblog services poses significant challenges on network traffic and service latency to the underlay infrastructure (i.e., geo-distributed data centers). Furthermore, the dynamic evolution in microblog status generates a huge workload on data consistence maintenance. In this article, motivated by insights of cross-media analysis-based propagation patterns, we propose a novel cache strategy for microblog service systems to reduce the inter-data center traffic and consistence maintenance cost, while achieving low service latency. Specifically, we first present a microblog classification method, which utilizes the external knowledge from correlated domains, to categorize microblogs. Then we conduct a large-scale measurement on a representative online social network system to study the category-based propagation diversity on region and time scales. These insights illustrate social common habits on creating and consuming microblogs and further motivate our architecture design. Finally, we formulate the content cache problem as a constrained optimization problem. By jointly using the Lyapunov optimization framework and simplex gradient method, we find the optimal online control strategy. Extensive trace-driven experiments further demonstrate that our algorithm reduces the system cost by 24.5% against traditional approaches with the same service latency.
KW - Cross-media analysis
KW - Data center
KW - Performance optimization
KW - Social media analytics
UR - http://www.scopus.com/inward/record.url?scp=85018843170&partnerID=8YFLogxK
U2 - 10.1145/3014431
DO - 10.1145/3014431
M3 - Article
AN - SCOPUS:85018843170
SN - 2157-6904
VL - 8
JO - ACM Transactions on Intelligent Systems and Technology
JF - ACM Transactions on Intelligent Systems and Technology
IS - 3
M1 - 40
ER -