Time series data cleaning: From anomaly detection to anomaly repairing

Aoqian Zhang, Shaoxu Song, Jianmin Wang, Philip S. Yu

Research output: Contribution to journal › Conference article › peer-review

113 Citations (Scopus)

Abstract

Errors are prevalent in time series data, such as GPS trajectories or sensor readings. Existing methods focus more on anomaly detection but not on repairing the detected anomalies. By simply filtering out the dirty data via anomaly detection, applications could still be unreliable over the incomplete time series. Instead of simply discarding anomalies, we propose to (iteratively) repair them in time series data, by creatively bonding the beauty of temporal nature in anomaly detection with the widely considered minimum change principle in data repairing. Our major contributions include: (1) a novel framework of iterative minimum repairing (IMR) over time series data, (2) explicit analysis on convergence of the proposed iterative minimum repairing, and (3) efficient estimation of parameters in each iteration. Remarkably, with incremental computation, we reduce the complexity of parameter estimation from O(n) to O(1). Experiments on real datasets demonstrate the superiority of our proposal compared to the state-of-the-art approaches. In particular, we show that (the proposed) repairing indeed improves the time series classification application.
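The incremental parameter estimation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model, so this assumes an order-1 autoregressive fit z[t] ≈ φ·z[t-1] by least squares, and assumes each iteration changes a single value z[i]. The class and method names are hypothetical and are not taken from the paper. Under those assumptions, the two sums that define φ can be adjusted in constant time instead of being recomputed over all n points.

```python
class IncrementalAR1:
    """Least-squares estimate of phi in z[t] ~ phi * z[t-1].

    Hypothetical helper for illustration only: it keeps the running sums
    num = sum(z[t-1] * z[t]) and den = sum(z[t-1] ** 2) for t = 1..n-1,
    so that changing one value costs O(1) instead of O(n).
    """

    def __init__(self, z):
        self.z = list(z)
        # One O(n) pass to initialise the sums.
        self.num = sum(a * b for a, b in zip(self.z, self.z[1:]))
        self.den = sum(a * a for a in self.z[:-1])

    def phi(self):
        # Guard against an all-zero series.
        return self.num / self.den if self.den else 0.0

    def update(self, i, new_value):
        """Replace z[i] and adjust both sums in O(1)."""
        z = self.z
        old_value = z[i]
        delta = new_value - old_value
        if i >= 1:
            # Term z[i-1] * z[i] in num.
            self.num += z[i - 1] * delta
        if i < len(z) - 1:
            # Term z[i] * z[i+1] in num, and z[i] ** 2 in den.
            self.num += delta * z[i + 1]
            self.den += new_value * new_value - old_value * old_value
        z[i] = new_value


def estimate_from_scratch(z):
    """Reference O(n) computation, used only to check the incremental one."""
    num = sum(a * b for a, b in zip(z, z[1:]))
    den = sum(a * a for a in z[:-1])
    return num / den if den else 0.0


z = [0.0, 0.0, 0.0, -10.0, -10.0, 0.0, 0.0, 0.0]
est = IncrementalAR1(z)
est.update(5, -5.0)  # one value changes, as in a single repair step
```

After the update, `est.phi()` agrees with `estimate_from_scratch(est.z)`. Over a long run of updates, floating-point drift in the running sums may call for an occasional full recomputation.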

Original language: English
Pages (from-to): 1046-1057
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 10
Issue number: 10
DOIs
Publication status: Published - 1 Jun 2017
Externally published: Yes
Event: 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: 28 Aug 2017 to 1 Sept 2017
