TY - GEN
T1 - Learning to rank microblog posts for real-time ad-hoc search
AU - Li, Jing
AU - Wei, Zhongyu
AU - Wei, Hao
AU - Zhao, Kangfei
AU - Chen, Junwen
AU - Wong, Kam Fai
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Microblogging websites have moved to the center of information production and diffusion, and people can get useful information from other users’ microblog posts. In the era of Big Data, we are overwhelmed by the large volume of microblog posts. To make good use of this informative data, an effective search tool specialized for microblog posts is required. However, microblog search is not trivial, for two reasons: 1) microblog posts are noisy and time-sensitive, rendering general information retrieval models ineffective; 2) conventional IR models are not designed to consider microblog-specific features. In this paper, we propose to use learning-to-rank models for microblog search. We combine content-based, microblog-specific, and temporal features in learning-to-rank models, which are found to model microblog posts effectively. To study the performance of learning-to-rank models, we evaluate them on the tweet data sets provided by the TREC 2011 and TREC 2012 Microblog tracks, comparing against three state-of-the-art information retrieval baselines: the vector space model, the language model, and the BM25 model. Extensive experiments demonstrate the effectiveness of learning-to-rank models and the usefulness of integrating microblog-specific and temporal information for the microblog search task.
AB - Microblogging websites have moved to the center of information production and diffusion, and people can get useful information from other users’ microblog posts. In the era of Big Data, we are overwhelmed by the large volume of microblog posts. To make good use of this informative data, an effective search tool specialized for microblog posts is required. However, microblog search is not trivial, for two reasons: 1) microblog posts are noisy and time-sensitive, rendering general information retrieval models ineffective; 2) conventional IR models are not designed to consider microblog-specific features. In this paper, we propose to use learning-to-rank models for microblog search. We combine content-based, microblog-specific, and temporal features in learning-to-rank models, which are found to model microblog posts effectively. To study the performance of learning-to-rank models, we evaluate them on the tweet data sets provided by the TREC 2011 and TREC 2012 Microblog tracks, comparing against three state-of-the-art information retrieval baselines: the vector space model, the language model, and the BM25 model. Extensive experiments demonstrate the effectiveness of learning-to-rank models and the usefulness of integrating microblog-specific and temporal information for the microblog search task.
KW - Experimental study
KW - Information retrieval
KW - Microblog search
KW - Microblogging analysis
KW - Online social network
UR - http://www.scopus.com/inward/record.url?scp=84951262867&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-25207-0_40
DO - 10.1007/978-3-319-25207-0_40
M3 - Conference contribution
AN - SCOPUS:84951262867
SN - 9783319252063
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 436
EP - 443
BT - Natural Language Processing and Chinese Computing - 4th CCF Conference, NLPCC 2015, Proceedings
A2 - Ji, Heng
A2 - Zhao, Dongyan
A2 - Feng, Yansong
A2 - Li, Juanzi
PB - Springer Verlag
T2 - 4th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2015
Y2 - 9 October 2015 through 13 October 2015
ER -