TY - GEN
T1 - You Are How You Use
T2 - 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
AU - Yang, Xiaodu
AU - Yi, Xiuwen
AU - Chen, Shun
AU - Ruan, Sijie
AU - Zhang, Junbo
AU - Zheng, Yu
AU - Li, Tianrui
N1 - Publisher Copyright: © 2020 ACM.
PY - 2020/10/19
Y1 - 2020/10/19
AB - Gas theft by restaurants is a major concern in the gas industry: it causes revenue losses for gas companies and seriously endangers public safety. Traditional gas-theft detection methods rely heavily on manual effort and are extremely ineffective. Thanks to the gas consumption data collected by smart meters, we can devise a data-driven method to tackle this issue. In this paper, we propose msRank, a gas-theft detection method that discovers suspicious restaurant users when only scarce labels are available. Our method contains three main components: 1) data pre-processing, which filters reading noise and excludes users with missing data or zero usage; 2) normal user modeling, which quantifies the self-stable seasonality of normal users and distinguishes them from unstable ones; and 3) gas-theft suspect detection, which discovers gas-theft suspects among unstable users by RankNet-based suspicion scoring on extracted deviation features. By using detected normal users as negative samples to train RankNet, the normal user modeling and gas-theft suspect detection components are seamlessly connected, overcoming the problem of label scarcity. We conduct extensive experiments on three real-world datasets, and the results demonstrate the advantages of our approach. We have deployed a system, GasShield, which provides a weekly gas-theft suspect list for a gas group in northern China.
KW - gas theft detection
KW - non-technical losses
KW - time series anomaly detection
KW - urban computing
KW - utility fraud detection
UR - http://www.scopus.com/inward/record.url?scp=85095866415&partnerID=8YFLogxK
U2 - 10.1145/3340531.3412751
DO - 10.1145/3340531.3412751
M3 - Conference contribution
AN - SCOPUS:85095866415
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2885
EP - 2892
BT - CIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 19 October 2020 through 23 October 2020
ER -