BigDataBench: An open-source big data benchmark suite

Jian Feng Zhan, Wan Ling Gao, Lei Wang, Jing Wei Li, Kai Wei, Chun Jie Luo, Rui Han, Xin Hui Tian, Chun Yu Jiang

Research output: Contribution to journal, Article, peer-review

13 Citations (Scopus)

Abstract

The boom in big data has sparked tremendous interest in storing and processing these data. As a result, a variety of big data systems have emerged, putting great pressure on big data benchmarking. However, the complexity and diversity of big data pose great challenges for benchmarking. Most related benchmark efforts either target specific application domains and software stacks or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, a multi-discipline research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology; it covers five representative application domains and contains diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for three data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases including workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications, together with ICT, CAS, Huawei and other well-known companies and research institutions, has released China's first industry-standard big data benchmark suite.

Original language: English
Pages (from-to): 196-211
Number of pages: 16
Journal: Jisuanji Xuebao/Chinese Journal of Computers
Volume: 39
Issue number: 1
Publication status: Published - 1 Jan 2016
Externally published: Yes

Keywords

  • Benchmarking methodology
  • Benchmarks
  • Big data
  • Data generation
  • Industry standard
  • Use cases
