SPAE: Lifelong disk failure prediction via end-to-end GAN-based anomaly detection with ensemble update

Yu Liu, Yunchuan Guan, Tianming Jiang*, Ke Zhou, Hua Wang, Guangxing Hu, Ji Zhang, Wei Fang, Zhuo Cheng, Ping Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Disk failure prediction aims to predict upcoming disk failures in advance to ensure high data reliability. Numerous supervised machine learning methods successfully predict disk failures using SMART attributes as input. However, these approaches rely heavily on a substantial number of annotated failed disks, so prediction performance degrades when failed disks are scarce at the start of a model's lifetime, which is known as the cold start problem. Inspired by the success of Generative Adversarial Network (GAN) based anomaly detection, this paper recasts disk failure prediction as an anomaly detection problem. Specifically, we develop a Semi-supervised method for lifelong disk failure Prediction via Adversarial training and Ensemble update, called SPAE. The advantage of SPAE over existing supervised approaches is that it can train the prediction model using only healthy disks, which avoids the cold start problem. Furthermore, SPAE can be updated through ensemble learning on emerging failed disks to resist the model aging problem. Compared with state-of-the-art supervised machine learning methods on real-world datasets, SPAE predicts disk failures with higher accuracy over the full lifetime of the models, i.e., during both the startup period and long-term usage.
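
The two ideas in the abstract (fitting a detector on healthy disks only, then adding ensemble members as labeled failed disks emerge) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not SPAE itself: a PCA reconstruction-error scorer stands in for the GAN, a nearest-centroid classifier stands in for the ensemble members, the voting rule is invented, and the six-dimensional "SMART-like" data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical structure: healthy readings lie near a 2-D subspace of 6 features.
mixing = np.array([[1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 0, 0]], dtype=float)


def make_disks(n, failing):
    """Synthetic feature vectors; failing disks drift on the last feature."""
    x = rng.normal(size=(n, 2)) @ mixing + 0.1 * rng.normal(size=(n, 6))
    if failing:
        x[:, 5] += 3.0
    return x


class HealthyOnlyDetector:
    """Reconstruction-error scorer fitted on healthy samples only.

    PCA is a stand-in for the GAN generator; it is not the paper's model.
    """

    def fit(self, x, n_components=2):
        self.mean = x.mean(axis=0)
        # Rows of vt are the principal directions of the centered healthy data.
        _, _, vt = np.linalg.svd(x - self.mean, full_matrices=False)
        self.components = vt[:n_components]
        return self

    def score(self, x):
        centered = x - self.mean
        recon = centered @ self.components.T @ self.components
        return np.linalg.norm(centered - recon, axis=1)


class CentroidMember:
    """Supervised member trained once labeled failed disks are available."""

    def fit(self, healthy, failed):
        self.h = healthy.mean(axis=0)
        self.f = failed.mean(axis=0)
        return self

    def vote(self, x):
        d_healthy = np.linalg.norm(x - self.h, axis=1)
        d_failed = np.linalg.norm(x - self.f, axis=1)
        return (d_failed < d_healthy).astype(float)


def ensemble_predict(x, detector, threshold, members):
    """Flag a disk when at least half of all voters predict failure."""
    votes = [(detector.score(x) > threshold).astype(float)]
    votes += [m.vote(x) for m in members]
    return np.mean(votes, axis=0) >= 0.5


healthy = make_disks(500, failing=False)
failed = make_disks(20, failing=True)

# Cold start: no failed disks are needed to fit the detector or the threshold.
detector = HealthyOnlyDetector().fit(healthy)
threshold = np.quantile(detector.score(healthy), 0.99)
cold_start_flags = ensemble_predict(failed, detector, threshold, [])

# Later update: the first ten observed failures train an extra member.
member = CentroidMember().fit(healthy, failed[:10])
updated_flags = ensemble_predict(failed[10:], detector, threshold, [member])
```

The threshold is taken from the healthy score distribution alone, which is what lets the sketch run before any failure has been observed; the member list then grows without refitting the original detector.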

Original language: English
Pages (from-to): 460-471
Number of pages: 12
Journal: Future Generation Computer Systems
Volume: 148
Publication status: Published - Nov 2023
Externally published: Yes

Keywords

  • Adversarial training
  • Anomaly detection
  • Data reliability
  • Disk failure
  • Ensemble update
  • SMART
