Abstract
System logs record critical information about system operations and are indispensable for system anomaly detection. Graph-based methods have outperformed other approaches in capturing the interdependencies among log events. However, existing methods often neglect the complex substructure patterns of nodes within log graphs, which makes it difficult to capture the subtle changes in event type, structure, and exception location that indicate node anomalies. To address this limitation, this paper proposes a novel framework called Substructure-aware Log Anomaly Detection at Code File Level (SLAD). SLAD first introduces a Monte Carlo Tree Search strategy tailored specifically to log anomaly detection to discover representative substructures. It then incorporates a substructure distillation method that makes anomaly inference based on the representative substructures more efficient. Finally, it introduces a soft pruning step to obtain the key substructures for nodes. Experimental results show that SLAD outperforms all baselines. In particular, SLAD is at least 15 times faster in anomaly inference than substructure-based graph learning methods.
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 213-225 |
| Number of pages | 13 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 18 |
| Issue number | 2 |
| Publication status | Published - 2025 |
| Externally published | Yes |
| Event | 51st International Conference on Very Large Data Bases, VLDB 2025 - London, United Kingdom; Duration: 1 Sept 2025 → 5 Sept 2025 |