TY - JOUR
T1 - A model fusion method based on multi-source heterogeneous data for stock trading signal prediction
AU - Chen, Xi
AU - Hirota, Kaoru
AU - Dai, Yaping
AU - Jia, Zhiyang
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2023/5
Y1 - 2023/5
N2 - In the prediction of turning points (TPs) of time series, the improved model integrating piecewise linear representation and weighted support vector machine (IPLR-WSVM) has achieved good performance. However, owing to its single data source and the limitations of the algorithm, IPLR-WSVM has encountered challenges in profitability. In this paper, a model fusion method based on multi-source heterogeneous data and different learning algorithms (MF-MSHD) is proposed for the prediction of TPs. Multi-source heterogeneous data include weighted unstructured and structured information with different granularities. RF, WSVM, BPNN, GBDT, and LSTM are selected as the learning algorithms. The meta-models are made to differ as much as possible through different inputs and algorithms, and a model fusion rule is designed to determine the final TPs. Moreover, the TPs are generated based on the characteristics of each individual stock. For sentiment analysis, a more accurate sentiment dictionary of stock market comments is established. Fine-grained data are also introduced to jointly determine the accurate trading moment. The proposed method improves accuracy and profitability and also outperforms the composite indexes. Experimental results show that the profit rate of randomly selected stocks under MF-MSHD reaches 0.5172, while the highest value is 0.2841 for a single meta-model and 0.0992 for the buy-and-hold strategy. The other indicators, including accuracy, are also improved. Compared with the increases of 0.1648, 0.4051, and 0.3397 in the Shanghai Composite Index, Shenzhen Composite Index, and CSI 300 Index, respectively, MF-MSHD shows higher profitability in stock trading signal prediction.
AB - In the prediction of turning points (TPs) of time series, the improved model integrating piecewise linear representation and weighted support vector machine (IPLR-WSVM) has achieved good performance. However, owing to its single data source and the limitations of the algorithm, IPLR-WSVM has encountered challenges in profitability. In this paper, a model fusion method based on multi-source heterogeneous data and different learning algorithms (MF-MSHD) is proposed for the prediction of TPs. Multi-source heterogeneous data include weighted unstructured and structured information with different granularities. RF, WSVM, BPNN, GBDT, and LSTM are selected as the learning algorithms. The meta-models are made to differ as much as possible through different inputs and algorithms, and a model fusion rule is designed to determine the final TPs. Moreover, the TPs are generated based on the characteristics of each individual stock. For sentiment analysis, a more accurate sentiment dictionary of stock market comments is established. Fine-grained data are also introduced to jointly determine the accurate trading moment. The proposed method improves accuracy and profitability and also outperforms the composite indexes. Experimental results show that the profit rate of randomly selected stocks under MF-MSHD reaches 0.5172, while the highest value is 0.2841 for a single meta-model and 0.0992 for the buy-and-hold strategy. The other indicators, including accuracy, are also improved. Compared with the increases of 0.1648, 0.4051, and 0.3397 in the Shanghai Composite Index, Shenzhen Composite Index, and CSI 300 Index, respectively, MF-MSHD shows higher profitability in stock trading signal prediction.
KW - Model fusion
KW - Multi-source heterogeneous data
KW - Sentiment analysis
KW - Stock trading signal prediction
UR - http://www.scopus.com/inward/record.url?scp=85143488973&partnerID=8YFLogxK
U2 - 10.1007/s00500-022-07714-4
DO - 10.1007/s00500-022-07714-4
M3 - Article
AN - SCOPUS:85143488973
SN - 1432-7643
VL - 27
SP - 6587
EP - 6611
JO - Soft Computing
JF - Soft Computing
IS - 10
ER -