A Measurable Framework for Run-time Data Sampling in Large-scale Datacenter

Hedong Yan, Shilin Wen, Rui Han

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In large-scale data centers, collecting run-time data is an effective way to analyze and monitor performance. However, because of the sheer size of these data centers, limited computing resources, and low-latency requirements, collecting all run-time data is impractical. Sampling a subset of the data is therefore the common approach at present. Existing work focuses on designing efficient sampling methods that reduce resource and time overhead, but it does not provide a unified, measurable framework to quantify the quality and practicality of sampling methods. In this paper, we propose a measurable framework for general run-time data sampling in large-scale data centers that explicitly models the underlying recovering hypothesis. The framework consists of four processes (sampling, collecting, recovering, and comparing) and measures the sampling bias degree accurately. We design and implement three sampling methods with different recovering hypotheses. The experimental results demonstrate that the framework helps identify a better run-time data sampling method, one with a lower sampling bias degree at the same sampling rate.
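As an illustration of the four processes named in the abstract, the following is a minimal sketch and not the authors' implementation. Everything specific in it is an assumption: the function names, the two recovering hypotheses (hold-last-value and linear interpolation), and the use of mean absolute error as a stand-in for the sampling bias degree, whose actual definition is not given in the abstract. The collecting step is reduced to returning the sampled points as a list.

```python
def sample_uniform(series, rate):
    """Sampling + collecting (hypothetical): keep every k-th point,
    k = round(1 / rate), and return them as (index, value) pairs."""
    step = max(1, round(1 / rate))
    return [(i, series[i]) for i in range(0, len(series), step)]


def recover_hold(samples, length):
    """Recovering under an assumed hypothesis: the value stays constant
    until the next sample arrives."""
    recovered, pos = [], 0
    for i in range(length):
        # Advance to the latest sample taken at or before index i.
        while pos + 1 < len(samples) and samples[pos + 1][0] <= i:
            pos += 1
        recovered.append(samples[pos][1])
    return recovered


def recover_linear(samples, length):
    """Recovering under an assumed hypothesis: the value changes linearly
    between consecutive samples (held constant after the last one)."""
    recovered, pos = [], 0
    for i in range(length):
        while pos + 1 < len(samples) and samples[pos + 1][0] <= i:
            pos += 1
        i0, v0 = samples[pos]
        if pos + 1 < len(samples):
            i1, v1 = samples[pos + 1]
            recovered.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
        else:
            recovered.append(v0)
    return recovered


def bias_degree(original, recovered):
    """Comparing: mean absolute error between the full series and the
    recovered one. This metric is an assumption, not the paper's."""
    return sum(abs(a - b) for a, b in zip(original, recovered)) / len(original)


if __name__ == "__main__":
    series = [float(i) for i in range(10)]       # synthetic run-time metric
    samples = sample_uniform(series, rate=0.5)   # same sampling rate for both
    for recover in (recover_hold, recover_linear):
        print(recover.__name__, bias_degree(series, recover(samples, len(series))))
```

Running both recovering functions on the same samples gives a like-for-like comparison at a fixed sampling rate, which is the kind of comparison the abstract describes; on real traces the ranking would depend on how well each hypothesis matches the data.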

Original language: English
Title of host publication: ICSIDP 2019 - IEEE International Conference on Signal, Information and Data Processing 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728123455
Publication status: Published - Dec 2019
Event: 2019 IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2019 - Chongqing, China
Duration: 11 Dec 2019 to 13 Dec 2019

Publication series

Name: ICSIDP 2019 - IEEE International Conference on Signal, Information and Data Processing 2019

Conference

Conference: 2019 IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2019
Country/Territory: China
City: Chongqing
Period: 11/12/19 to 13/12/19

Keywords

  • large-scale datacenter
  • measurable framework
  • recovering hypothesis
  • run-time data collecting
  • sampling bias degree
