TY - GEN
T1 - A study of collection-based features for adapting the balance parameter in pseudo relevance feedback
AU - Meng, Ye
AU - Zhang, Peng
AU - Song, Dawei
AU - Hou, Yuexian
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
AB - Pseudo-relevance feedback (PRF) is an effective technique for improving ad-hoc retrieval performance. For PRF methods, optimizing the balance parameter between the original query model and the feedback model is an important but difficult problem. Traditionally, the balance parameter is tuned manually and set to a fixed value across collections and queries. However, because collections and individual queries differ, this parameter should be tuned separately for each. Recent research has studied various query-based and feedback-document-based features to predict the optimal balance parameter for each query on a specific collection, through a learning approach based on logistic regression. In this paper, we hypothesize that characteristics of collections are also important for the prediction. We propose and systematically investigate a series of collection-based features for queries, feedback documents and candidate expansion terms. The experiments show that our method is competitive with state-of-the-art approaches in improving retrieval performance, particularly for cross-collection prediction.
KW - Collection characteristics
KW - Information retrieval
KW - Pseudo-relevance feedback
UR - http://www.scopus.com/inward/record.url?scp=84958051513&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-28940-3_21
DO - 10.1007/978-3-319-28940-3_21
M3 - Conference contribution
AN - SCOPUS:84958051513
SN - 9783319289397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 265
EP - 276
BT - Information Retrieval Technology - 11th Asia Information Retrieval Societies Conference, AIRS 2015, Proceedings
A2 - Scholer, Falk
A2 - Zuccon, Guido
A2 - Geva, Shlomo
A2 - Sun, Aixin
A2 - Joho, Hideo
A2 - Zhang, Peng
PB - Springer Verlag
T2 - 11th Asia Information Retrieval Societies Conference, AIRS 2015
Y2 - 2 December 2015 through 4 December 2015
ER -