Online GBDT with Chunk Dynamic Weighted Majority Learners for Noisy and Drifting Data Streams

Senlin Luo, Weixiao Zhao, Limin Pan*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Data stream mining has become a major research focus in data mining, with noise and concept drift as its two main challenges. Generally, sensitive classifiers tend to over-fit noisy samples, while robust classifiers tend to ignore concept drift in data streams. This conflict places high demands on online algorithms, which must handle both challenges at once. In this paper, a Chunk Dynamic Weighted Majority module, which processes samples chunk by chunk, is combined with the online Gradient Boosting Decision Tree framework to cope with noisy, drifting data streams. The method can discard weak classifiers that no longer suit the current concept distribution and create new weak classifiers to adapt to drifting data streams. In addition, a robust Doom2 loss function is developed to address noise sensitivity in stable data streams. Experimental results demonstrate that, compared with state-of-the-art online algorithms, the proposed algorithm obtains better results on both noisy stable data streams and drifting data streams.
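The chunk-wise discard-and-create behaviour described in the abstract can be illustrated with a generic dynamic-weighted-majority ensemble. The sketch below is not the paper's algorithm: it uses one-feature threshold stumps instead of gradient-boosted trees, omits the Doom2 loss entirely, and every name and parameter in it (`ChunkDWM`, `fit_stump`, `beta`, `theta`, `max_error`) is a hypothetical choice made for illustration only.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]  # (feature value, label in {-1, +1})


def fit_stump(chunk: List[Sample]) -> Callable[[float], int]:
    """Fit a one-feature threshold stump on a single chunk (illustrative weak learner)."""
    best = None
    for t in sorted({x for x, _ in chunk}):
        for sign in (1, -1):
            errors = sum(1 for x, y in chunk if (sign if x >= t else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x >= t else -sign


class ChunkDWM:
    """Generic chunk-wise dynamic weighted majority (assumed design, not the paper's)."""

    def __init__(self, beta: float = 0.5, theta: float = 0.05, max_error: float = 0.3):
        self.beta = beta            # weight multiplier for learners that fail on a chunk
        self.theta = theta          # learners below this normalised weight are discarded
        self.max_error = max_error  # chunk error rate that counts as a failure
        self.learners: list = []    # entries are [predict_fn, weight]

    def predict(self, x: float) -> int:
        score = sum(w * h(x) for h, w in self.learners)
        return 1 if score >= 0 else -1

    @staticmethod
    def _error(h: Callable[[float], int], chunk: List[Sample]) -> float:
        return sum(1 for x, y in chunk if h(x) != y) / len(chunk)

    def update(self, chunk: List[Sample]) -> float:
        """Test on the chunk first, then adapt; returns the pre-update ensemble error."""
        ensemble_error = self._error(self.predict, chunk) if self.learners else 1.0
        # Penalise learners that no longer fit the current chunk.
        for entry in self.learners:
            if self._error(entry[0], chunk) > self.max_error:
                entry[1] *= self.beta
        # Create a new learner when the ensemble as a whole did poorly.
        if ensemble_error > self.max_error:
            self.learners.append([fit_stump(chunk), 1.0])
        # Normalise by the largest weight, then discard learners that fell too low.
        top = max(w for _, w in self.learners)
        self.learners = [[h, w / top] for h, w in self.learners if w / top >= self.theta]
        return ensemble_error
```

In this sketch a new learner is added before normalisation, so a freshly trained learner immediately outweighs a penalised one after an abrupt drift; a learner that keeps failing is halved on each chunk until it falls below `theta` and is removed.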

Original language: English
Pages (from-to): 3783-3799
Number of pages: 17
Journal: Neural Processing Letters
Volume: 53
Issue: 5
Publication status: Published - Oct 2021
