TY - GEN
T1 - Document Boltzmann Machines for Information Retrieval
AU - Yu, Qian
AU - Zhang, Peng
AU - Hou, Yuexian
AU - Song, Dawei
AU - Wang, Jun
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Probabilistic language modelling has been widely used in information retrieval. It estimates document models under the multinomial distribution assumption, and uses query likelihood to rank documents. In this paper, we aim to generalize this distribution assumption by exploring the use of fully-observable Boltzmann Machines (BMs) for document modelling. A BM is a stochastic recurrent network that can model the joint distribution of multi-dimensional variables. It yields a Boltzmann distribution, which is more general than the multinomial distribution. We propose a Document Boltzmann Machine (DBM) that can naturally capture the intrinsic connections among terms and estimate query likelihood efficiently. We formally prove that under certain conditions (when only 1-order parameters are learnt), the DBM subsumes the traditional document language model. Its relations to other graphical models in IR, e.g., the MRF model, are also discussed. Our experiments on document reranking demonstrate the potential of the proposed DBM.
AB - Probabilistic language modelling has been widely used in information retrieval. It estimates document models under the multinomial distribution assumption, and uses query likelihood to rank documents. In this paper, we aim to generalize this distribution assumption by exploring the use of fully-observable Boltzmann Machines (BMs) for document modelling. A BM is a stochastic recurrent network that can model the joint distribution of multi-dimensional variables. It yields a Boltzmann distribution, which is more general than the multinomial distribution. We propose a Document Boltzmann Machine (DBM) that can naturally capture the intrinsic connections among terms and estimate query likelihood efficiently. We formally prove that under certain conditions (when only 1-order parameters are learnt), the DBM subsumes the traditional document language model. Its relations to other graphical models in IR, e.g., the MRF model, are also discussed. Our experiments on document reranking demonstrate the potential of the proposed DBM.
UR - http://www.scopus.com/inward/record.url?scp=84925435410&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-16354-3_73
DO - 10.1007/978-3-319-16354-3_73
M3 - Conference contribution
AN - SCOPUS:84925435410
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 666
EP - 671
BT - Advances in Information Retrieval - 37th European Conference on IR Research, ECIR 2015, Proceedings
A2 - Hanbury, Allan
A2 - Rauber, Andreas
A2 - Kazai, Gabriella
A2 - Fuhr, Norbert
PB - Springer Verlag
T2 - 37th European Conference on Information Retrieval Research, ECIR 2015
Y2 - 29 March 2015 through 2 April 2015
ER -