摘要
Streaming computing attracts intense attention because of the demand for massive data analyzing in real-time. Due to unbounded and continuous input, the volume of streaming data is so high that all the data cannot be permanently stored. Piecewise polynomial fitting is a popular data compression method that approximately represents the raw data stream with multiple polynomials. The polynomial coefficients corresponding to the best-fitting curve can be calculated by the method of least squares, which minimizes the sum of the squared residuals between observed and fitted values. However, built on several matrix calculations, the method of least squares always leads to high time complexity and is difficult to be applied to streaming computing. This paper puts forward a fast piecewise polynomial fitting for time-series data in streaming computing. The input data stream is dynamically segmented according to a given residual bound. Meanwhile, the data points in each segment are fitted using an improved polynomial fitting method, which has less time overhead than general polynomial fitting by reusing the intermediate calculation results. Experimental results on four time-series datasets show that our algorithm can achieve the highest speedup to the general piecewise polynomial fitting of 2.82 \times for periodically sampled time-series data and 1.85 \times for aperiodically sampled time-series data, without affecting the compression ratio and fitting accuracy. Moreover, the event-time latency comparison in a streaming environment indicates that the improved method can endure higher throughput than general piecewise polynomial fitting with the same latency.
源语言 | 英语 |
---|---|
文章编号 | 9016024 |
页(从-至) | 43764-43775 |
页数 | 12 |
期刊 | IEEE Access |
卷 | 8 |
DOI | |
出版状态 | 已出版 - 2020 |