TY - JOUR
T1 - Mapping sentences to concept transferred space for semantic textual similarity
AU - Huang, Heyan
AU - Wu, Hao
AU - Wei, Xiaochi
AU - Gao, Yang
AU - Shi, Shumin
N1 - Publisher Copyright: © 2018, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2019/9/1
Y1 - 2019/9/1
N2 - Semantic textual similarity (STS) seeks to assess the degree of semantic equivalence between two sentences or snippets of text. Most STS methods operate on the surface form of words and treat words as semantically unrelated symbols, which leaves them unable to capture the ubiquitous conceptual associations among words. Recently, the concept transferred space (CTS) was proposed to address this word conceptual association problem. It is generated from the noun concepts in WordNet and their IS-A relations. However, the CTS-based model can only handle nouns; as a result, a large number of words, namely verbs, adjectives, and adverbs, as well as out-of-vocabulary named entities (OOV NEs), are neglected, which causes information loss in the semantic similarity evaluation. This paper presents ways to solve this problem. To include words other than nouns, derivational links in WordNet are used to associate verbs, adjectives, and adverbs with their corresponding noun concepts. To prevent the information loss caused by OOV NEs, the additional quantity of information they contribute is predicted from the tendency learned from known NEs. Moreover, to further improve the accuracy of the CTS-based model, we account for the importance of different types of words by assigning them corresponding weights. Experimental results suggest that the proposed comprehensive CTS-based model achieves a significant improvement over the primitive model, which lacks the non-nominal words, OOV NEs, and word weights, and that it also outperforms all the yearly state-of-the-art systems at the *SEM/SemEval 2013–2016 STS tasks. Additionally, at the SemEval 2017 STS task, our team, using the comprehensive CTS-based model, ranked second among all teams and first on the Track 1 dataset.
KW - Concept transferred space
KW - Information content
KW - Semantic textual similarity
KW - WordNet
UR - http://www.scopus.com/inward/record.url?scp=85053539166&partnerID=8YFLogxK
U2 - 10.1007/s10115-018-1261-3
DO - 10.1007/s10115-018-1261-3
M3 - Article
AN - SCOPUS:85053539166
SN - 0219-1377
VL - 60
SP - 1353
EP - 1376
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 3
ER -