The dynamic Bloom filters

Deke Guo; Jie Wu; Honghui Chen; Ye Yuan; Xueshan Luo

doi:10.1109/TKDE.2009.57

Corresponding author: Deke Guo

Research output: Contribution to journal › Article › peer-review

157 Citations (Scopus)

Abstract

A Bloom filter is an effective, space-efficient data structure for concisely representing a set, and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets, as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in stand-alone applications, as well as distributed applications.
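The capacity-expansion idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the class names, the parameters `m`, `k`, and `capacity`, and the SHA-256 double-hashing scheme are all assumptions made for illustration. The sketch keeps a list of fixed-size Bloom filters, inserts into the newest one, and appends a fresh filter once the active one has received `capacity` items; a query checks every filter. Deletion and union, which the abstract also lists, are omitted.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with m bit slots and k hash positions per item."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit slot, for readability
        self.count = 0            # number of add() calls so far

    def _positions(self, item: str) -> list[int]:
        # Assumed hashing scheme: double hashing over one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


class DynamicBloomFilter:
    """Hypothetical sketch: grows by appending fixed-size Bloom filters."""

    def __init__(self, m: int = 1024, k: int = 4, capacity: int = 100) -> None:
        self.m, self.k, self.capacity = m, k, capacity
        self.filters = [BloomFilter(m, k)]

    def add(self, item: str) -> None:
        active = self.filters[-1]
        if active.count >= self.capacity:
            # The active filter is full: expand instead of overfilling it.
            active = BloomFilter(self.m, self.k)
            self.filters.append(active)
        active.add(item)

    def __contains__(self, item: str) -> bool:
        # An item may be in any sub-filter, so all of them are checked.
        return any(item in f for f in self.filters)


dbf = DynamicBloomFilter(capacity=100)
for i in range(250):
    dbf.add(f"item-{i}")
print(len(dbf.filters), "item-42" in dbf)
```

Because each sub-filter holds at most `capacity` items, its individual false positive rate stays bounded as the set grows, at the cost of a query that scans every sub-filter.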

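Original language	English
Article number	4796196
Pages (from-to)	120-133
Number of pages	14
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	22
Issue number	1
DOI	https://doi.org/10.1109/TKDE.2009.57
Publication status	Published - Jan 2010
Externally published	Yes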

Cite this

@article{c3bee4de9bd643f4a36a9bc145fcb611,

title = "The dynamic {B}loom filters",

abstract = "A Bloom filter is an effective, space-efficient data structure for concisely representing a set, and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets, as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in stand-alone applications, as well as distributed applications.",

keywords = "Bloom filters, Dynamic Bloom filters, Information representation",

author = "Deke Guo and Jie Wu and Honghui Chen and Ye Yuan and Xueshan Luo",

year = "2010",

month = jan,

doi = "10.1109/TKDE.2009.57",

language = "English",

volume = "22",

pages = "120--133",

journal = "IEEE Transactions on Knowledge and Data Engineering",

issn = "1041-4347",

publisher = "IEEE Computer Society",

number = "1",

}

TY - JOUR

T1 - The dynamic Bloom filters

AU - Guo, Deke

AU - Wu, Jie

AU - Chen, Honghui

AU - Yuan, Ye

AU - Luo, Xueshan

PY - 2010/1

Y1 - 2010/1

AB - A Bloom filter is an effective, space-efficient data structure for concisely representing a set, and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets, as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in stand-alone applications, as well as distributed applications.

KW - Bloom filters

KW - Dynamic Bloom filters

KW - Information representation

UR - http://www.scopus.com/inward/record.url?scp=72949117506&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2009.57

DO - 10.1109/TKDE.2009.57

M3 - Article

AN - SCOPUS:72949117506

SN - 1041-4347

VL - 22

SP - 120

EP - 133

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

IS - 1

M1 - 4796196

ER -
