TY - JOUR
T1 - Contrastive learning enhanced deep neural network with serial regularization for high-dimensional tabular data
AU - Wu, Yao
AU - Zhu, Donghua
AU - Wang, Xuefeng
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/10/15
Y1 - 2023/10/15
AB - As the scale of data grows rapidly across fields, the vast amount of information in big data can facilitate scientific discovery and decision-making. Deep neural networks prevail in modeling big data such as images and text in computer vision and natural language processing. However, there is currently no widely adopted deep neural network for high-dimensional tabular data (HTD), as high dimensionality increases a model's complexity and makes parameter estimation more difficult. Therefore, this paper proposes CLDNSR, a contrastive learning-enhanced deep neural network with serial regularization. The method combines relaxed Bernoulli distribution-based L0 regularization with adaptive L2 regularization for important-feature selection and adaptive redundancy control, enabling it to handle high-dimensional input features effectively. In addition, a tabular contrastive pre-training method is proposed to stabilize the supervised training process through better parameter initialization. Experiments on eleven real-world high-dimensional tabular datasets demonstrate that CLDNSR outperforms baseline models designed for high-dimensional data.
KW - Contrastive learning
KW - Deep neural network
KW - High-dimensional tabular data
KW - Serial regularization
UR - http://www.scopus.com/inward/record.url?scp=85159368483&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2023.120243
DO - 10.1016/j.eswa.2023.120243
M3 - Article
AN - SCOPUS:85159368483
SN - 0957-4174
VL - 228
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 120243
ER -