TY - JOUR
T1 - BigDataBench
T2 - An open-source big data benchmark suite
AU - Zhan, Jian Feng
AU - Gao, Wan Ling
AU - Wang, Lei
AU - Li, Jing Wei
AU - Wei, Kai
AU - Luo, Chun Jie
AU - Han, Rui
AU - Tian, Xin Hui
AU - Jiang, Chun Yu
N1 - Publisher Copyright:
© 2016, Science Press. All rights reserved.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - The boom in big data has sparked tremendous interest in storing and processing these data, and consequently a variety of big data systems have emerged, creating great demand for big data benchmarking. However, the complexity and diversity of big data pose great challenges to big data benchmarking. Most related benchmarking efforts either target specific application domains and software stacks, or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, which is a multi-discipline research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology, not only covering five representative application domains but also containing diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for 3 kinds of data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases in workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications released China's first industry-standard big data benchmark suite together with ICT, CAS, Huawei, and other well-known companies and research institutions.
AB - The boom in big data has sparked tremendous interest in storing and processing these data, and consequently a variety of big data systems have emerged, creating great demand for big data benchmarking. However, the complexity and diversity of big data pose great challenges to big data benchmarking. Most related benchmarking efforts either target specific application domains and software stacks, or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, which is a multi-discipline research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology, not only covering five representative application domains but also containing diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for 3 kinds of data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases in workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications released China's first industry-standard big data benchmark suite together with ICT, CAS, Huawei, and other well-known companies and research institutions.
KW - Benchmarking methodology
KW - Benchmarks
KW - Big data
KW - Data generation
KW - Industry standard
KW - Use cases
UR - http://www.scopus.com/inward/record.url?scp=84957032939&partnerID=8YFLogxK
U2 - 10.11897/SP.J.1016.2016.00196
DO - 10.11897/SP.J.1016.2016.00196
M3 - Article
AN - SCOPUS:84957032939
SN - 0254-4164
VL - 39
SP - 196
EP - 211
JO - Jisuanji Xuebao/Chinese Journal of Computers
JF - Jisuanji Xuebao/Chinese Journal of Computers
IS - 1
ER -