Abstract
Because training spectra are scarce for many kinds of tissue, laser-induced breakdown spectroscopy (LIBS) struggles to reach the desired diagnostic accuracy with conventional supervised learning identification methods. In this paper, we propose combining predictive data clustering with supervised learning to identify tissue information accurately. The meanshift clustering method is introduced and compared with three other clustering methods that have previously been used in the LIBS field. We propose the cluster precision (CP) score as a new criterion to be used together with the Calinski-Harabasz (CH) score for evaluating the clustering effect. The influence of principal component analysis (PCA) on all four clustering methods is also analyzed. PCA-meanshift shows the best clustering effect in a comprehensive evaluation that combines the CH and CP scores. Using the spatial location and feature similarity information provided by the predictive clustering, PCA-meanshift improves diagnostic accuracy from less than 95% to 100% for all classifiers tested: support vector machine (SVM), k-nearest neighbor (k-NN), soft independent modeling of class analogy (SIMCA), and random forest (RF) models.
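For readers unfamiliar with the CH score mentioned above, the following is a minimal sketch of the standard Calinski-Harabasz index: the ratio of between-cluster to within-cluster dispersion, each normalized by its degrees of freedom. It is not the authors' implementation. The function name `calinski_harabasz`, the toy points, and the choice to return infinity for zero within-cluster dispersion are assumptions made for illustration. The CP score is not sketched because its definition is not given in this abstract.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Standard CH index for points (sequences of floats) and cluster labels.

    Hypothetical helper for illustration; higher values mean tighter,
    better-separated clusters.
    """
    n = len(points)
    dim = len(points[0])

    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of samples")

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted squared distance of centroids to the global mean
    within = 0.0   # squared distance of points to their own centroid
    for rows in clusters.values():
        centroid = mean(rows)
        between += len(rows) * sq_dist(centroid, overall)
        within += sum(sq_dist(r, centroid) for r in rows)

    if within == 0.0:
        # Assumption: treat perfectly compact clusters as an unbounded score.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two well-separated groups along the first axis.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_score = calinski_harabasz(toy_points, [0, 0, 1, 1])  # 200.0
poor_score = calinski_harabasz(toy_points, [0, 1, 0, 1])  # 0.02
```

The labeling that follows the natural grouping scores far higher than the one that cuts across it, which is the sense in which the CH score ranks clustering results.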
Original language | English |
---|---|
Pages (from-to) | 4438-4451 |
Number of pages | 14 |
Journal | Biomedical Optics Express |
Volume | 12 |
Issue number | 7 |
Publication status | Published - 1 Jul 2021 |