Efficiently Sampling and Estimating Hypergraphs By Hybrid Random Walk

Lingling Zhang*, Zhiwei Zhang, Guoren Wang, Ye Yuan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Citations (Scopus)

Abstract

Hypergraphs provide a powerful tool for representing group interactions in complicated networks. Analyzing statical properties of hypergraphs by sampling is an increasing fundamental research problem in the field of data processing. However, the state-of-the-art sampling methods either focus on pairwise graphs or are insensitive to the structures formed by vertices and hyperedges, resulting in estimations with low accuracy and efficiency. To efficiently characterize the properties of both vertices and hyperedges, this paper first proposes a hybrid random walk based Markov Chain Monte Carlo (MCMC) model theoretically by carefully designing its mixture states and the transition matrix. For simplifying the implementation of this model, we develop an algorithm formed by vertex and hyperedge transitions saving costs for constructing mixture states in practice along with an estimating method for accurate estimations. Furthermore, we employ a non-backtracking strategy in the vertex transitions to accelerate the convergence of the hybrid random walk and propose to skip the sampled vertices in the hyperedge transitions to avoid being trapped in the local subgraph for improving accuracy and reducing query cost. Extensive experimental results on the real-world datasets confirm the higher accuracy and efficiency of our proposed methods than the sophisticated sampling methods.

Original languageEnglish
Title of host publicationProceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
PublisherIEEE Computer Society
Pages1273-1285
Number of pages13
ISBN (Electronic)9798350322279
DOIs
Publication statusPublished - 2023
Event39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, United States
Duration: 3 Apr 20237 Apr 2023

Publication series

NameProceedings - International Conference on Data Engineering
Volume2023-April
ISSN (Print)1084-4627

Conference

Conference39th IEEE International Conference on Data Engineering, ICDE 2023
Country/TerritoryUnited States
CityAnaheim
Period3/04/237/04/23

Fingerprint

Dive into the research topics of 'Efficiently Sampling and Estimating Hypergraphs By Hybrid Random Walk'. Together they form a unique fingerprint.

Cite this