On big data benchmarking

Rui Han*, Lu Xiaoyi, Xu Jiangtao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review

16 Citations (Scopus)

Abstract

Big data systems address the challenges of capturing, storing, managing, analyzing, and visualizing big data. Within this context, developing benchmarks to evaluate and compare big data systems has become an active topic for both the research and industry communities. To date, most state-of-the-art big data benchmarks are designed for specific types of systems. Based on our experience, however, we argue that, considering the complexity, diversity, and rapid evolution of big data systems, big data benchmarks must include a diversity of data and workloads for the sake of fairness. Given this motivation, in this paper we first propose the key requirements and challenges in developing big data benchmarks from two perspectives: generating data with the 4V properties of big data (i.e., volume, velocity, variety, and veracity), and generating tests with comprehensive workloads for big data systems. We then present a methodology for big data benchmarking designed to address these challenges. Next, state-of-the-art benchmarks are summarized and compared, followed by our vision for future research directions.

Original language: English
Title of host publication: Big Data Benchmarks, Performance Optimization, and Emerging Hardware - 4th and 5th Workshops, BPOE 2014, Revised Selected Papers
Editors: Jianfeng Zhan, Rui Han, Chuliang Weng
Publisher: Springer Verlag
Pages: 3-18
Number of pages: 16
ISBN (Electronic): 9783319130200
DOIs: https://doi.org/10.1007/978-3-319-13021-7_1
Publication status: Published - 2014
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8807
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Benchmark
  • Big data systems
  • Data
  • Tests

Cite this

Han, R., Lu, X., & Xu, J. (2014). On big data benchmarking. In J. Zhan, R. Han, & C. Weng (Eds.), Big Data Benchmarks, Performance Optimization, and Emerging Hardware - 4th and 5th Workshops, BPOE 2014, Revised Selected Papers (pp. 3-18). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8807). Springer Verlag. https://doi.org/10.1007/978-3-319-13021-7_1