Outlier Summarization via Human Interpretable Rules

Yuhao Deng, Yu Wang, Lei Cao, Lianpeng Qiao*, Yuping Wang, Jingzhe Xu, Yizhou Yan, Samuel Madden

*Corresponding author for this work

Research output: Contribution to journal, conference article, peer-reviewed

Abstract

Outlier detection is crucial for preventing financial fraud, network intrusions, and device failures. Users often expect systems to automatically summarize and interpret outlier detection results to reduce human effort and convert outliers into actionable insights. However, existing methods fail to effectively assist users in identifying the root causes of outliers, as they only pinpoint data attributes without considering that outliers in the same subspace may have different causes. To fill this gap, we propose STAIR, which learns concise and human-understandable rules to summarize and explain outlier detection results with finer granularity. These rules consider both attributes and associated values. STAIR employs an interpretation-aware optimization objective to generate a small number of rules with minimal complexity for strong interpretability. The learning algorithm of STAIR produces a rule set by iteratively splitting the large rules and is optimal in maximizing this objective in each iteration. Moreover, to effectively handle high-dimensional, highly complex data sets that are hard to summarize with simple rules, we propose a localized STAIR approach, called L-STAIR. Taking data locality into consideration, it simultaneously partitions data and learns a set of localized rules for each partition. Our experimental study on many outlier benchmark datasets shows that STAIR significantly reduces the complexity of the rules required to summarize the outlier detection results, making them more amenable for humans to understand and evaluate.
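
As a rough illustration of what rules over attributes and their values can look like, the Python sketch below summarizes a detector's output by greedily splitting one large rule into smaller ones. It is not the STAIR algorithm and does not use its interpretation-aware objective: the split criterion here is a plain misclassification count, chosen only for brevity. The `Rule` class, the `best_split` and `summarize` functions, the toy `amount` and `hour` attributes, and the outlier labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    # Each predicate is (attribute, op, threshold) with op in {"<=", ">"}.
    predicates: list = field(default_factory=list)
    label: int = 0  # 1 = outlier, 0 = inlier

    def covers(self, row):
        return all(
            row[a] <= t if op == "<=" else row[a] > t
            for a, op, t in self.predicates
        )

    def __str__(self):
        cond = " AND ".join(f"{a} {op} {t:g}" for a, op, t in self.predicates)
        return f"IF {cond or 'TRUE'} THEN {'outlier' if self.label else 'inlier'}"


def errors(labels):
    """Rows misclassified when the majority label is predicted for all of them."""
    ones = sum(labels)
    return min(ones, len(labels) - ones)


def best_split(rows, labels):
    """Return (attribute, threshold, error) of the lowest-error split, or None."""
    best = None
    for attr in rows[0]:
        values = sorted({r[attr] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for r, y in zip(rows, labels) if r[attr] <= t]
            right = [y for r, y in zip(rows, labels) if r[attr] > t]
            err = errors(left) + errors(right)
            if best is None or err < best[2]:
                best = (attr, t, err)
    return best


def summarize(rows, labels, max_rules=8):
    """Greedily split impure rules until all are pure or max_rules is reached."""
    work = [(Rule(), rows, labels)]
    done = []
    while work:
        rule, rs, ys = work.pop()
        split = best_split(rs, ys) if errors(ys) > 0 else None
        if split is None or len(work) + len(done) + 2 > max_rules:
            rule.label = int(sum(ys) * 2 > len(ys))  # majority label
            done.append(rule)
            continue
        attr, t, _ = split
        for op in ("<=", ">"):
            keep = [i for i, r in enumerate(rs) if (r[attr] <= t) == (op == "<=")]
            child = Rule(rule.predicates + [(attr, op, t)])
            work.append((child, [rs[i] for i in keep], [ys[i] for i in keep]))
    return done


# Toy data: labels stand in for the output of some outlier detector.
rows = [
    {"amount": 20, "hour": 10},
    {"amount": 35, "hour": 14},
    {"amount": 50, "hour": 3},
    {"amount": 900, "hour": 11},
    {"amount": 950, "hour": 2},
    {"amount": 40, "hour": 16},
]
labels = [0, 0, 0, 1, 1, 0]

rules = summarize(rows, labels)
for r in rules:
    print(r)
```

On this toy input the two flagged rows are separated by a single threshold on `amount`, so the summary consists of two rules. The `max_rules` cap is a crude stand-in for the idea of keeping the rule set small; a real objective would trade off rule count and rule length more carefully.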

Original language: English
Pages (from-to): 1591-1604
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 17
Issue number: 7
Publication status: Published - 2024
Event: 50th International Conference on Very Large Data Bases, VLDB 2024 - Guangzhou, China
Duration: 24 Aug 2024 to 29 Aug 2024