TY - JOUR
T1 - BigDataBench
T2 - An open-source big data benchmark suite
AU - Zhan, Jian Feng
AU - Gao, Wan Ling
AU - Wang, Lei
AU - Li, Jing Wei
AU - Wei, Kai
AU - Luo, Chun Jie
AU - Han, Rui
AU - Tian, Xin Hui
AU - Jiang, Chun Yu
N1 - Publisher Copyright:
© 2016, Science Press. All rights reserved.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - The boom in big data has sparked tremendous interest in storing and processing these data, and consequently a variety of big data systems have emerged, creating great demand for big data benchmarking. However, the complexity and diversity of big data pose great challenges to big data benchmarking. Most related benchmarking efforts either target specific application domains and software stacks, or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, which is a multi-discipline research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology, not only covering five representative application domains but also containing diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for 3 kinds of data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases in workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications released China's first industry-standard big data benchmark suite together with ICT, CAS, Huawei, and other well-known companies and research institutions.
AB - The boom in big data has sparked tremendous interest in storing and processing these data, and consequently a variety of big data systems have emerged, creating great demand for big data benchmarking. However, the complexity and diversity of big data pose great challenges to big data benchmarking. Most related benchmarking efforts either target specific application domains and software stacks, or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, which is a multi-discipline research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology, not only covering five representative application domains but also containing diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for 3 kinds of data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases in workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications released China's first industry-standard big data benchmark suite together with ICT, CAS, Huawei, and other well-known companies and research institutions.
KW - Benchmarking methodology
KW - Benchmarks
KW - Big data
KW - Data generation
KW - Industry standard
KW - Use cases
UR - http://www.scopus.com/inward/record.url?scp=84957032939&partnerID=8YFLogxK
U2 - 10.11897/SP.J.1016.2016.00196
DO - 10.11897/SP.J.1016.2016.00196
M3 - Article
AN - SCOPUS:84957032939
SN - 0254-4164
VL - 39
SP - 196
EP - 211
JO - Jisuanji Xuebao/Chinese Journal of Computers
JF - Jisuanji Xuebao/Chinese Journal of Computers
IS - 1
ER -