Complex-valued Neural Network-based Quantum Language Models

Peng Zhang; Wenjie Hui; Benyou Wang; Donghao Zhao; Dawei Song; Christina Lioma; Jakob Grue Simonsen

doi:10.1145/3505138

Complex-valued Neural Network-based Quantum Language Models

Peng Zhang, Wenjie Hui, Benyou Wang^*, Donghao Zhao, Dawei Song, Christina Lioma, Jakob Grue Simonsen

^*此作品的通讯作者

计算机学院

科研成果: 期刊稿件 › 文章 › 同行评审

13 引用（Scopus）

摘要

Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words.This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM.The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM's performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.

源语言	英语
文章编号	84
期刊	ACM Transactions on Information Systems
卷	40
期	4
DOI	https://doi.org/10.1145/3505138
出版状态	已出版 - 10月 2022

访问文件

10.1145/3505138

其它文件与链接

链接到 Scopus 的出版物

引用此

Zhang, P., Hui, W., Wang, B., Zhao, D., Song, D., Lioma, C., & Simonsen, J. G. (2022). Complex-valued Neural Network-based Quantum Language Models. ACM Transactions on Information Systems, 40(4), 文章 84. https://doi.org/10.1145/3505138

@article{723889a24dac47f28cec00eac3b6d6a3,

title = "Complex-valued Neural Network-based Quantum Language Models",

abstract = "Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words.This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM.The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM's performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.",

keywords = "Quantum theory, language model, neural network, question answering",

author = "Peng Zhang and Wenjie Hui and Benyou Wang and Donghao Zhao and Dawei Song and Christina Lioma and Simonsen, {Jakob Grue}",

note = "Publisher Copyright: {\textcopyright} 2022 Association for Computing Machinery.",

year = "2022",

month = oct,

doi = "10.1145/3505138",

language = "English",

volume = "40",

journal = "ACM Transactions on Information Systems",

issn = "1046-8188",

publisher = "Association for Computing Machinery (ACM)",

number = "4",

}

TY - JOUR

T1 - Complex-valued Neural Network-based Quantum Language Models

AU - Zhang, Peng

AU - Hui, Wenjie

AU - Wang, Benyou

AU - Zhao, Donghao

AU - Song, Dawei

AU - Lioma, Christina

AU - Simonsen, Jakob Grue

PY - 2022/10

Y1 - 2022/10

N2 - Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words.This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM.The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM's performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.

AB - Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words.This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM.The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM's performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.

KW - Quantum theory

KW - language model

KW - neural network

KW - question answering

UR - http://www.scopus.com/inward/record.url?scp=85130242036&partnerID=8YFLogxK

U2 - 10.1145/3505138

DO - 10.1145/3505138

M3 - Article

AN - SCOPUS:85130242036

SN - 1046-8188

VL - 40

JO - ACM Transactions on Information Systems

JF - ACM Transactions on Information Systems

IS - 4

M1 - 84

ER -

Complex-valued Neural Network-based Quantum Language Models

摘要

访问文件

其它文件与链接

指纹

引用此