TY - JOUR
T1 - Query Expansion with Local Conceptual Word Embeddings in Microblog Retrieval
AU - Wang, Yashen
AU - Huang, Heyan
AU - Feng, Chong
N1 - Publisher Copyright: © 1989-2012 IEEE.
PY - 2021/4/1
Y1 - 2021/4/1
N2 - Since the length of microblog texts, such as tweets, is strictly limited to 140 characters, traditional Information Retrieval techniques suffer severely from the vocabulary mismatch problem and cannot yield good performance in the microblogosphere. To address this challenge, we focus on the use of local conceptual word embeddings to enhance microblog retrieval effectiveness. In particular, we propose a novel k-Nearest Neighbor (kNN) based Query Expansion (QE) algorithm that generates words from local word embeddings to expand the original query, which leads to a better understanding of the information need. In addition, to further satisfy users' real-time information needs, we incorporate temporal evidence into the expansion algorithm, which boosts recent tweets in the retrieval results for a given topic. Experimental results on the official TREC Twitter corpora demonstrate the significant superiority of our approach over baseline methods.
AB - Since the length of microblog texts, such as tweets, is strictly limited to 140 characters, traditional Information Retrieval techniques suffer severely from the vocabulary mismatch problem and cannot yield good performance in the microblogosphere. To address this challenge, we focus on the use of local conceptual word embeddings to enhance microblog retrieval effectiveness. In particular, we propose a novel k-Nearest Neighbor (kNN) based Query Expansion (QE) algorithm that generates words from local word embeddings to expand the original query, which leads to a better understanding of the information need. In addition, to further satisfy users' real-time information needs, we incorporate temporal evidence into the expansion algorithm, which boosts recent tweets in the retrieval results for a given topic. Experimental results on the official TREC Twitter corpora demonstrate the significant superiority of our approach over baseline methods.
KW - Microblog retrieval
KW - pseudo-relevance feedback
KW - query expansion
KW - word embeddings
UR - http://www.scopus.com/inward/record.url?scp=85102243817&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2019.2945764
DO - 10.1109/TKDE.2019.2945764
M3 - Article
AN - SCOPUS:85102243817
SN - 1041-4347
VL - 33
SP - 1737
EP - 1749
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 4
M1 - 8861105
ER -