TY - JOUR
T1 - GCache
T2 - Neighborhood-Guided Graph Caching in a Distributed Environment
AU - Yuan, Ye
AU - Lian, Xiang
AU - Chen, Lei
AU - Wang, Guoren
AU - Yu, Jeffrey Xu
AU - Wang, Yishu
AU - Ma, Yuliang
N1 - Publisher Copyright:
© 1990-2012 IEEE.
PY - 2019/11/1
Y1 - 2019/11/1
N2 - Distributed graph systems are becoming extremely popular due to their flexibility, scalability, and robustness in big graph processing. In order to improve the performance of the distributed graph systems, caching is a very effective technique to achieve fast response and reduce the communication cost. Existing works include online and offline caching algorithms. Online caching algorithms (such as least recently used (LRU) and most recently used (MRU)) are lightweight and flexible, however, neglect the topological properties of big graphs. Offline caching algorithms (such as node pre-ordered) consider the graph topology, but are very expensive and heavy. In this paper, we propose a novel caching mechanism, GraphCache (GCache), for big distributed graphs. GCache consists of an offline phase and an online phase, which inherits the advantages of online and offline caching algorithms. Specifically, the offline phase provides a caching model based on the bipartite graph clustering and give efficient algorithms to solve it. The online phase caches and schedules the graph clusters output from the offline phase, based on the LRU and MRU strategies. GCache can be seamlessly integrated into the state-of-the-art graph processing systems, e.g., Giraph. Finally, our experimental results demonstrate the feasibility of our proposed caching techniques in speeding up graph algorithms over distributed big graphs.
AB - Distributed graph systems are becoming extremely popular due to their flexibility, scalability, and robustness in big graph processing. In order to improve the performance of the distributed graph systems, caching is a very effective technique to achieve fast response and reduce the communication cost. Existing works include online and offline caching algorithms. Online caching algorithms (such as least recently used (LRU) and most recently used (MRU)) are lightweight and flexible, however, neglect the topological properties of big graphs. Offline caching algorithms (such as node pre-ordered) consider the graph topology, but are very expensive and heavy. In this paper, we propose a novel caching mechanism, GraphCache (GCache), for big distributed graphs. GCache consists of an offline phase and an online phase, which inherits the advantages of online and offline caching algorithms. Specifically, the offline phase provides a caching model based on the bipartite graph clustering and give efficient algorithms to solve it. The online phase caches and schedules the graph clusters output from the offline phase, based on the LRU and MRU strategies. GCache can be seamlessly integrated into the state-of-the-art graph processing systems, e.g., Giraph. Finally, our experimental results demonstrate the feasibility of our proposed caching techniques in speeding up graph algorithms over distributed big graphs.
KW - Distributed caching
KW - large graph
UR - http://www.scopus.com/inward/record.url?scp=85077394722&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2019.2915300
DO - 10.1109/TPDS.2019.2915300
M3 - Article
AN - SCOPUS:85077394722
SN - 1045-9219
VL - 30
SP - 2463
EP - 2477
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 11
M1 - 8708952
ER -