Speal: Achieving a More Accurate Model with Less Training Data in Performance Evaluation of Storage System through Sampling Optimization

Liang Bao, Hua Wang*, Ke Zhou, Guangyu Zhang, Ji Zhang, Xi Peng, Qingqing Yang, Renhai Chen, Gong Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Performance evaluation, a crucial component of Quality of Service (QoS), is of significant importance for modern storage systems. Previous machine learning-based methods ignore the fact that different training datasets improve the model to different degrees. Suboptimal random sampling may lead to the collection of unnecessary training data, making dataset construction excessively costly. This problem becomes more pronounced when sampling and storage system resources are constrained. In this paper, we propose Speal (Storage System Performance Evaluator with Active Learning), which uses machine learning to predict the performance of a workload running on the storage system. We present a straightforward yet highly effective active learning algorithm called E2 sampling, employed during the model construction phase to reduce the cost of acquiring the training dataset. Furthermore, we apply Speal to the storage system to facilitate bandwidth control and optimize performance. In our experiments using performance data collected from a real storage system, Speal exhibits up to a 1.75x reduction in prediction error compared to other active learning algorithms. Additionally, applying bandwidth control enhanced by Speal's performance evaluation to the storage system leads to an average throughput improvement of up to 1.51x and a reduction in tail latency of up to 1.71x relative to the baseline.

Original language: English
Title of host publication: Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Kejing Lu, Sihem Amer-Yahia, H.V. Jagadish
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 179-194
Number of pages: 16
ISBN (Print): 9789819757787
Publication status: Published - 2025
Externally published: Yes
Event: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024 - Gifu, Japan
Duration: 2 Jul 2024 to 5 Jul 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14851 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
Country/Territory: Japan
City: Gifu
Period: 2/07/24 to 5/07/24

Keywords

  • Active Learning
  • Machine Learning
  • Performance Evaluation
  • Storage System
