Online GBDT with Chunk Dynamic Weighted Majority Learners for Noisy and Drifting Data Streams

Senlin Luo, Weixiao Zhao, Limin Pan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

In data mining, data stream mining has become a major research focus, with noise and concept drift as its two main challenges. Sensitive classifiers tend to over-fit noisy samples, while robust classifiers tend to ignore concept drift in data streams. This tension places high demands on online algorithms, which must handle both challenges at once. In this paper, a Chunk Dynamic Weighted Majority module, which processes samples chunk by chunk, is combined with the online Gradient Boosting Decision Tree (GBDT) framework to cope with noisy, drifting data streams. The method can discard weak classifiers that are no longer appropriate for the current concept distribution and create new weak classifiers to adapt to drifting data streams. In addition, a robust Doom2 loss function is developed to address noise sensitivity in stable data streams. Experimental results demonstrate that, compared with state-of-the-art online algorithms, the proposed algorithm obtains better results on both noisy stable data streams and drifting data streams.
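
The chunk-by-chunk weighting idea in the abstract can be illustrated with a small, generic sketch. This is not the authors' implementation: it follows the general Dynamic Weighted Majority scheme applied per chunk, and it leaves out both the online GBDT framework and the Doom2 loss. The names `ChunkDWM` and `CentroidLearner`, the parameters `beta`, `theta`, and `max_learners`, and the 0.5 error thresholds are all assumptions made for illustration.

```python
from collections import defaultdict


class CentroidLearner:
    """Toy incremental nearest-centroid classifier (hypothetical stand-in for a weak learner)."""

    def __init__(self):
        self.sums = {}                  # label -> per-feature running sum
        self.counts = defaultdict(int)  # label -> number of samples seen

    def partial_fit(self, X, y):
        for x, label in zip(X, y):
            if label not in self.sums:
                self.sums[label] = [0.0] * len(x)
            self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
            self.counts[label] += 1

    def predict_one(self, x):
        if not self.sums:
            return None  # untrained learner abstains

        def dist(label):
            centroid = [s / self.counts[label] for s in self.sums[label]]
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.sums, key=dist)


class ChunkDWM:
    """Chunk-wise weighted-majority ensemble (illustrative sketch, assumed parameters)."""

    def __init__(self, beta=0.5, theta=0.1, max_learners=10):
        self.beta = beta                  # weight decay for learners that fail on a chunk
        self.theta = theta                # learners below this weight are discarded
        self.max_learners = max_learners  # cap on ensemble size
        self.learners = [CentroidLearner()]
        self.weights = [1.0]

    def predict_one(self, x):
        votes = defaultdict(float)
        for learner, weight in zip(self.learners, self.weights):
            pred = learner.predict_one(x)
            if pred is not None:
                votes[pred] += weight
        return max(votes, key=votes.get) if votes else None

    def process_chunk(self, X, y):
        """Test on the chunk, reweight and prune, then train. Returns the ensemble error rate."""
        n = len(X)
        ens_error = sum(self.predict_one(x) != t for x, t in zip(X, y)) / n
        # Penalise learners that are wrong on most of the chunk.
        for i, learner in enumerate(self.learners):
            err = sum(learner.predict_one(x) != t for x, t in zip(X, y)) / n
            if err > 0.5:
                self.weights[i] *= self.beta
        # Normalise so the best learner has weight 1, then drop weak learners.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]
        kept = [(l, w) for l, w in zip(self.learners, self.weights) if w >= self.theta]
        self.learners = [l for l, _ in kept]
        self.weights = [w for _, w in kept]
        # If the ensemble as a whole failed on the chunk, add a fresh learner.
        if ens_error > 0.5 and len(self.learners) < self.max_learners:
            self.learners.append(CentroidLearner())
            self.weights.append(1.0)
        for learner in self.learners:
            learner.partial_fit(X, y)
        return ens_error


# Toy stream: two chunks of one concept, then three chunks with flipped labels (abrupt drift).
X = [[-2.0], [-1.0], [1.0], [2.0]]
concept_a = [0, 0, 1, 1]
concept_b = [1, 1, 0, 0]
model = ChunkDWM()
errors = [model.process_chunk(X, y) for y in (concept_a, concept_a, concept_b, concept_b, concept_b)]
print(errors)
```

In this toy run the ensemble error rises when the labels flip and falls again once learners trained on the new concept outweigh the older ones. A per-chunk update, rather than a per-sample one, means a few noisy samples inside a chunk are less likely to trigger a penalty or a new learner.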

Original language: English
Pages (from-to): 3783-3799
Number of pages: 17
Journal: Neural Processing Letters
Volume: 53
Issue number: 5
Publication status: Published - Oct 2021

Keywords

  • Concept drift
  • Dynamic Weighted Majority
  • Loss function
  • Noise
  • Online GBDT
