Effective temporal dependence discovery in time series data

Qingchao Cai, Zhongle Xie, Meihui Zhang*, Gang Chen, H. V. Jagadish, Beng Chin Ooi

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

8 Citations (Scopus)

Abstract

To analyze user behavior over time, it is useful to group users into cohorts, giving rise to cohort analysis. We identify several crucial limitations of current cohort analysis, motivated by the unmet need for temporal dependence discovery. To address these limitations, we propose a generalization that we call recurrent cohort analysis. We introduce a set of operators for recurrent cohort analysis and design access methods specific to these operators in both single-node and distributed environments. Through extensive experiments, we show that recurrent cohort analysis, when implemented using the proposed access methods, is up to six orders of magnitude faster than an implementation layered on top of a database in a single-node setting, and two orders of magnitude faster than an implementation using Spark SQL in a distributed setting.

Original language: English
Pages (from-to): 893-905
Number of pages: 13
Journal: Proceedings of the VLDB Endowment
Volume: 11
Issue number: 8
Publication status: Published - 2018
Event: 44th International Conference on Very Large Data Bases, VLDB 2018 - Rio de Janeiro, Brazil
Duration: 27 Aug 2018 - 31 Aug 2018
