TY - JOUR
T1 - Improved Chinese sentence semantic similarity calculation method based on multi-feature fusion
AU - Liu, Liqi
AU - Wang, Qinglin
AU - Li, Yuan
N1 - Publisher Copyright:
© Fuji Technology Press Ltd.
PY - 2021/7
Y1 - 2021/7
N2 - In this paper, an improved long short-term memory (LSTM)-based deep neural network structure is proposed for learning the semantic similarity of variable-length Chinese sentences. Siamese LSTM, a sequence-insensitive deep neural network model, has a limited ability to capture the semantics of natural language because it struggles to account for semantic differences that arise from differences in syntactic structure or word order within a sentence. Therefore, the proposed model integrates the syntactic component features of the words in a sentence into the word vector representation layer to express the sentence's syntactic structure and the interdependence between words. Moreover, a relative position embedding layer is introduced into the model, mapping the relative positions of the words in the sentence to a high-dimensional space to capture their local position information. In this model, a parallel structure maps the two sentences into the same high-dimensional space to obtain fixed-length sentence vector representations. After aggregation, the sentence similarity is computed in the output layer. Experiments on Chinese sentences show that the model achieves good results in semantic similarity calculation.
KW - LSTM
KW - Relative position embedding
KW - Semantic similarity
KW - Syntactic component
UR - http://www.scopus.com/inward/record.url?scp=85111490014&partnerID=8YFLogxK
DO - 10.20965/JACIII.2021.P0442
M3 - Article
AN - SCOPUS:85111490014
SN - 1343-0130
VL - 25
SP - 442
EP - 449
JO - Journal of Advanced Computational Intelligence and Intelligent Informatics
JF - Journal of Advanced Computational Intelligence and Intelligent Informatics
IS - 4
ER -