Abstract
Data stream mining has become a major research focus in data mining, and noise and concept drift are its two main challenges. Sensitive classifiers tend to over-fit noisy samples, while robust classifiers tend to ignore concept drift in data streams. This tension places high demands on online algorithms, which must handle both challenges at once. In this paper, a Chunk Dynamic Weighted Majority module, which processes samples chunk by chunk, is combined with the online Gradient Boosting Decision Tree (GBDT) framework to cope with noisy, drifting data streams. The method discards weak classifiers that no longer fit the current concept distribution and creates new weak classifiers to adapt to the drift. In addition, a robust Doom2 loss function is developed to reduce noise sensitivity in stable data streams. Experimental results demonstrate that, compared with state-of-the-art online algorithms, the proposed algorithm achieves better results on both noisy stable data streams and drifting data streams.
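The chunk-wise discard-and-create behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no update rules, so the class names (`CentroidExpert`, `ChunkDWM`), the parameters (`beta`, `theta`, `max_error`), and the rule of penalising an expert once per chunk are all assumptions made for illustration. A toy nearest-centroid learner stands in for the online GBDT weak classifiers, and the Doom2 loss is not modelled.

```python
from collections import defaultdict


class CentroidExpert:
    """Toy incremental learner (a stand-in, not GBDT): nearest running class mean."""

    def __init__(self):
        self.sums = {}                    # label -> per-feature running sum
        self.counts = defaultdict(int)    # label -> number of samples seen

    def learn(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict(self, x):
        if not self.sums:
            return None                   # untrained expert abstains

        def sq_dist(label):
            n = self.counts[label]
            return sum((s / n - v) ** 2 for s, v in zip(self.sums[label], x))

        return min(self.sums, key=sq_dist)


class ChunkDWM:
    """Hypothetical chunk-wise weighted-majority ensemble (all thresholds assumed)."""

    def __init__(self, beta=0.5, theta=0.05, max_error=0.5):
        self.beta = beta            # weight multiplier for a poorly performing expert
        self.theta = theta          # experts whose weight falls below this are discarded
        self.max_error = max_error  # chunk error rate that counts as poor performance
        self.experts = [CentroidExpert()]
        self.weights = [1.0]

    def predict(self, x):
        votes = defaultdict(float)
        for expert, weight in zip(self.experts, self.weights):
            votes[expert.predict(x)] += weight
        return max(votes, key=votes.get)

    def process_chunk(self, chunk):
        n = len(chunk)
        # Test first: measure the ensemble on the chunk before learning from it.
        ensemble_error = sum(self.predict(x) != y for x, y in chunk) / n
        # Down-weight each expert whose error on this chunk is too high.
        for i, expert in enumerate(self.experts):
            error = sum(expert.predict(x) != y for x, y in chunk) / n
            if error > self.max_error:
                self.weights[i] *= self.beta
        # Rescale so the best weight is 1, then discard experts below theta.
        top = max(self.weights)
        kept = [(e, w / top) for e, w in zip(self.experts, self.weights)
                if w / top >= self.theta]
        self.experts = [e for e, _ in kept]
        self.weights = [w for _, w in kept]
        # Create a fresh expert when the ensemble as a whole failed on the chunk.
        if ensemble_error > self.max_error:
            self.experts.append(CentroidExpert())
            self.weights.append(1.0)
        # Then train every expert on the chunk.
        for expert in self.experts:
            for x, y in chunk:
                expert.learn(x, y)
        return ensemble_error


# Usage: two chunks of one concept, then an abrupt drift that flips the labels.
chunk_a = [([-1.0], 0), ([-2.0], 0), ([1.0], 1), ([2.0], 1)]
chunk_b = [(x, 1 - y) for x, y in chunk_a]
model = ChunkDWM()
errors = [model.process_chunk(c)
          for c in (chunk_a, chunk_a, chunk_b, chunk_b, chunk_b)]
print(errors)
```

In this toy stream the per-chunk error rises when the labels flip and falls again once newly created experts outvote the down-weighted ones, which is the adaptation the abstract describes in words.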
Original language | English |
---|---|
Pages (from-to) | 3783-3799 |
Number of pages | 17 |
Journal | Neural Processing Letters |
Volume | 53 |
Issue number | 5 |
DOIs | |
Publication status | Published - Oct 2021 |
Keywords
- Concept drift
- Dynamic Weighted Majority
- Loss function
- Noise
- Online GBDT