Performance enhancement-based active learning sample selection method

Zhonghai He*, Shijie Song, Kun Shen, Xiaofang Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Representative samples are important for multivariate calibration. Efficiently selecting the representative samples to be labelled can save money and time. Existing methods, such as Kennard-Stone and net analyte signal selection, are usually based on the distance between candidate samples and the labelled calibration set in feature space. However, these distances depend on the feature space, which is spanned by an information vector extracted from the labelled samples. To overcome this drawback of distance-based selection, a model performance enhancement-based sample selection method is proposed to select calibration samples efficiently. Based on loss function optimization, the samples that improve model performance the most, as estimated by bootstrap, are sequentially selected and added to the calibration set. Because each selected sample is highly representative, a model built from a few samples shows no significant loss of prediction ability compared with a model built from a large set of calibration samples. The performance enhancement-based active learning (PEAL) sample selection method is both effective and efficient.
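
The sequential selection loop described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss function or the exact bootstrap estimator, so everything below is an assumption made for illustration: a plain least-squares model stands in for the calibration model, and the variance of bootstrap-model predictions on each unlabelled candidate is used as a proxy for the expected performance gain. The names fit_linear, bootstrap_scores, select_sequentially and oracle are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept (assumed stand-in for the calibration model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def bootstrap_scores(X_lab, y_lab, X_pool, n_boot=50, rng=None):
    """Score each pool sample by the variance of bootstrap-model predictions.

    Assumption: high disagreement between bootstrap models is treated as a
    proxy for a large expected performance gain from labelling that sample.
    """
    rng = np.random.default_rng(rng)
    n = len(X_lab)
    preds = np.empty((n_boot, len(X_pool)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample labelled set with replacement
        coef = fit_linear(X_lab[idx], y_lab[idx])
        preds[b] = predict_linear(coef, X_pool)
    return preds.var(axis=0)

def select_sequentially(X, oracle, init_idx, n_select, n_boot=50, seed=0):
    """Add one sample at a time to the calibration set, labelling it via oracle(i)."""
    rng = np.random.default_rng(seed)
    labelled = list(init_idx)
    y = {i: oracle(i) for i in labelled}  # reference values measured so far
    for _ in range(n_select):
        pool = [i for i in range(len(X)) if i not in y]
        X_lab = X[labelled]
        y_lab = np.array([y[i] for i in labelled])
        scores = bootstrap_scores(X_lab, y_lab, X[pool], n_boot, rng)
        best = pool[int(np.argmax(scores))]  # candidate with the highest score
        y[best] = oracle(best)               # "label" it, e.g. by reference analysis
        labelled.append(best)
    return labelled
```

Here oracle(i) represents the costly reference measurement of sample i. Replacing the variance score with a direct bootstrap estimate of the loss reduction would bring the sketch closer to the performance-enhancement criterion named in the abstract, but that estimator is not specified on this page.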

Original language: English
Article number: e3386
Journal: Journal of Chemometrics
Volume: 36
Issue number: 3
Publication status: Published - Mar 2022

Keywords

  • bootstrap modelling
  • feature space
  • parsimonious sample selection
  • performance enhancement
  • set representation
