A Survey of Approximate Quantile Computation on Large-Scale Data

Zhiwei Chen, Aoqian Zhang*

*Corresponding author for this work

Research output: Contribution to journal, article, peer-reviewed

13 citations (Scopus)

Abstract

As data volumes grow, data profiling helps to extract metadata from large-scale data. However, one kind of metadata, order statistics, is difficult to compute because order statistics are neither mergeable nor incremental. As a result, time and memory limits make their computation on large-scale data impractical. In this paper, we focus on one order statistic, quantiles, and present a comprehensive analysis of studies on approximate quantile computation. We cover both deterministic and randomized algorithms that compute approximate quantiles over streaming or distributed models. We then present techniques for improving the efficiency and performance of approximate quantile algorithms in various scenarios, such as skewed data and high-speed data streams. Finally, we conclude with coverage of existing packages in different languages and a brief discussion of future directions in this area.

Original language: English
Article number: 9001104
Pages (from-to): 34585-34597
Number of pages: 13
Journal: IEEE Access
Volume: 8
Publication status: Published - 2020
Externally published
