Approximately Counting Butterflies in Large Bipartite Graph Streams

Rundong Li, Pinghui Wang*, Peng Jia, Xiangliang Zhang, Junzhou Zhao, Jing Tao, Ye Yuan, Xiaohong Guan

*此作品的通讯作者

科研成果: 期刊稿件文章同行评审

9 引用 (Scopus)

摘要

Bipartite graphs widely exist in real-world scenarios and model binary relations like host-website, author-paper, and user-product. In bipartite graphs, a butterfly (i.e., $2\times 2$2×2 bi-clique) is the smallest non-trivial cohesive structure and plays an important role in applications such as anomaly detection. Considerable efforts focus on counting butterflies in static bipartite graphs. However, they suffer from high time and space complexity when the bipartite graph of interest is given as a stream of edges. Although there are methods for approximately counting butterflies from bipartite graph streams, they suffer from either low accuracy or high time complexity. Therefore, it is still a challenge to accurately estimate butterfly counts from bipartite graph streams in a short time. To address this issue, we develop novel algorithms by exploiting the bipartite nature, which subtly integrates sampling and sketching techniques. We provide accurate estimators for butterfly counts and derive simple yet exact formulas for bounding their errors. We also conduct extensive experiments on a variety of real-world large bipartite graphs. Experimental results demonstrate that our algorithms are up to 20.0 times more accurate and up to 286.3 times faster than state-of-the-art methods under the same memory usage.

源语言英语
页(从-至)5621-5635
页数15
期刊IEEE Transactions on Knowledge and Data Engineering
34
12
DOI
出版状态已出版 - 1 12月 2022

指纹

探究 'Approximately Counting Butterflies in Large Bipartite Graph Streams' 的科研主题。它们共同构成独一无二的指纹。

引用此