Multidimensional Features Helping Predict Failures in Production SSD-Based Consumer Storage Systems

Xinyan Zhang, Zhipeng Tan*, Dan Feng*, Qiang He*, Wan Ju, Jiang Hao, Ji Zhang, Lihua Yang, Wenjie Qi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Because SSD failures can cause data loss and service interruption, proactive failure prediction is often used to improve system availability. However, unidimensional SMART-based prediction models can hardly predict all drive failures, and some other features applied in data centers and enterprise storage systems are not readily available in consumer storage systems (CSS). To further analyze related failures in production SSD-based CSS, we study nearly 2.3 million SSDs from 12 drive models based on a dataset of SMART logs, trouble tickets, and error logs. We discover that SMART, Firmware Version, WindowsEvent, and Blue Screen of Death (SFWB) are closely related to SSD failures. We further propose a multidimensional-based failure prediction approach (MFPA), which is portable across algorithms, SSD vendors, and PC manufacturers. Experiments on the datasets show that SFWB-based MFPA achieves a high true positive rate (98.18%) and a low false positive rate (0.56%), which are 4% higher and 86% lower, respectively, than those of the SMART-based model. It is robust and can continuously predict for 2-3 months without iteration, substantially improving system availability.

Original language: English
Title of host publication: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9783981926378
Publication status: Published - 2023
Externally published: Yes
Event: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Antwerp, Belgium
Duration: 17 Apr 2023 → 19 Apr 2023

Publication series

Name: Proceedings - Design, Automation and Test in Europe, DATE
Volume: 2023-April
ISSN (Print): 1530-1591

Conference

Conference: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023
Country/Territory: Belgium
City: Antwerp
Period: 17/04/23 → 19/04/23

Keywords

  • failure prediction
  • machine learning
  • multidimensional features
  • SSD
  • system availability
