TY - JOUR
T1 - Unbalanced data classification based on a whale swarm optimized random forest algorithm
AU - Ye, Lizhu
AU - Zheng, Donghua
AU - Liu, Yuehong
AU - Niu, Shaohua
N1 - Publisher Copyright:
© 2022 Journal of Nanjing Institute of Posts and Telecommunications. All rights reserved.
PY - 2022/12
Y1 - 2022/12
N2 - To improve the accuracy of unbalanced data classification, the random forest algorithm is used for classification, and the whale swarm optimization algorithm is adopted to optimize the key parameters of the random forest, enhancing the adaptability of the random forest algorithm to unbalanced data. First, an unbalanced data classification model is built on the random forest; its multiple decision tree weak classifiers effectively address the classification difficulties caused by sample imbalance. Second, the whale swarm optimization algorithm is used to optimize the weights of the weak classifiers, with the average classification accuracy taken as the fitness function, which improves the accuracy of the weak classifiers' weighted voting on the final classification result. Finally, the random forest model optimized by the whale swarm algorithm is used to classify the unbalanced data. Experiments show that, with reasonably set parameters for the whale swarm optimization algorithm, weak classifier weights that yield higher classification accuracy can be obtained. Compared with other unbalanced data classification algorithms, the proposed algorithm achieves better classification performance.
AB - To improve the accuracy of unbalanced data classification, the random forest algorithm is used for classification, and the whale swarm optimization algorithm is adopted to optimize the key parameters of the random forest, enhancing the adaptability of the random forest algorithm to unbalanced data. First, an unbalanced data classification model is built on the random forest; its multiple decision tree weak classifiers effectively address the classification difficulties caused by sample imbalance. Second, the whale swarm optimization algorithm is used to optimize the weights of the weak classifiers, with the average classification accuracy taken as the fitness function, which improves the accuracy of the weak classifiers' weighted voting on the final classification result. Finally, the random forest model optimized by the whale swarm algorithm is used to classify the unbalanced data. Experiments show that, with reasonably set parameters for the whale swarm optimization algorithm, weak classifier weights that yield higher classification accuracy can be obtained. Compared with other unbalanced data classification algorithms, the proposed algorithm achieves better classification performance.
KW - decision tree
KW - random forest
KW - unbalanced data classification
KW - weak classifier
KW - whale swarm optimization algorithm
UR - http://www.scopus.com/inward/record.url?scp=85153231374&partnerID=8YFLogxK
U2 - 10.14132/j.cnki.1673-5439.2022.06.012
DO - 10.14132/j.cnki.1673-5439.2022.06.012
M3 - Article
AN - SCOPUS:85153231374
SN - 1673-5439
VL - 42
SP - 99
EP - 105
JO - Nanjing Youdian Daxue Xuebao (Ziran Kexue Ban)/Journal of Nanjing University of Posts and Telecommunications (Natural Science)
JF - Nanjing Youdian Daxue Xuebao (Ziran Kexue Ban)/Journal of Nanjing University of Posts and Telecommunications (Natural Science)
IS - 6
ER -