Improving top-N recommendation performance using missing data

Xiangyu Zhao; Zhendong Niu; Kaiyi Wang; Ke Niu; Zhongqiang Liu

doi:10.1155/2015/380472

Xiangyu Zhao^*, Zhendong Niu, Kaiyi Wang, Ke Niu, Zhongqiang Liu

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

Recommender systems are becoming increasingly significant in addressing the information explosion problem. Data sparsity is a main challenge in this area. The massive number of unrated items constitutes missing data, with only a few observed ratings. Most studies treat missing data as unknown information and use only observed data to learn models and generate recommendations. However, data are missing not at random. Part of the missing data arises because users choose not to rate those items. This part of the missing data represents negative examples of user preferences. Utilizing this information is expected to improve the performance of recommendation algorithms. Unfortunately, negative examples are mixed with unlabeled positive examples in the missing data, and they are hard to distinguish. In this paper, we propose three schemes to utilize the negative examples in missing data. The schemes are then combined with SVD++, a state-of-the-art matrix factorization recommendation approach, to generate recommendations. Experimental results on two real datasets show that our proposed approaches achieve better top-N performance than the baselines in both accuracy and diversity.

Original language	English
Article number	380472
Journal	Mathematical Problems in Engineering
Volume	2015
DOIs	https://doi.org/10.1155/2015/380472
Publication status	Published - 2015

Cite this

@article{782154af0b5d439d931253fa3d7e14a9,

title = "Improving top-{N} recommendation performance using missing data",

abstract = "Recommender systems become increasingly significant in solving the information explosion problem. Data sparse is a main challenge in this area. Massive unrated items constitute missing data with only a few observed ratings. Most studies consider missing data as unknown information and only use observed data to learn models and generate recommendations. However, data are missing not at random. Part of missing data is due to the fact that users choose not to rate them. This part of missing data is negative examples of user preferences. Utilizing this information is expected to leverage the performance of recommendation algorithms. Unfortunately, negative examples are mixed with unlabeled positive examples in missing data, and they are hard to be distinguished. In this paper, we propose three schemes to utilize the negative examples in missing data. The schemes are then adapted with SVD++, which is a state-of-the-art matrix factorization recommendation approach, to generate recommendations. Experimental results on two real datasets show that our proposed approaches gain better top-N performance than the baseline ones on both accuracy and diversity.",

author = "Xiangyu Zhao and Zhendong Niu and Kaiyi Wang and Ke Niu and Zhongqiang Liu",

note = "Publisher Copyright: {\textcopyright} 2015 Xiangyu Zhao et al.",

year = "2015",

doi = "10.1155/2015/380472",

language = "English",

volume = "2015",

journal = "Mathematical Problems in Engineering",

issn = "1024-123X",

publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Improving top-N recommendation performance using missing data

AU - Zhao, Xiangyu

AU - Niu, Zhendong

AU - Wang, Kaiyi

AU - Niu, Ke

AU - Liu, Zhongqiang

PY - 2015

Y1 - 2015

N2 - Recommender systems become increasingly significant in solving the information explosion problem. Data sparse is a main challenge in this area. Massive unrated items constitute missing data with only a few observed ratings. Most studies consider missing data as unknown information and only use observed data to learn models and generate recommendations. However, data are missing not at random. Part of missing data is due to the fact that users choose not to rate them. This part of missing data is negative examples of user preferences. Utilizing this information is expected to leverage the performance of recommendation algorithms. Unfortunately, negative examples are mixed with unlabeled positive examples in missing data, and they are hard to be distinguished. In this paper, we propose three schemes to utilize the negative examples in missing data. The schemes are then adapted with SVD++, which is a state-of-the-art matrix factorization recommendation approach, to generate recommendations. Experimental results on two real datasets show that our proposed approaches gain better top-N performance than the baseline ones on both accuracy and diversity.

AB - Recommender systems become increasingly significant in solving the information explosion problem. Data sparse is a main challenge in this area. Massive unrated items constitute missing data with only a few observed ratings. Most studies consider missing data as unknown information and only use observed data to learn models and generate recommendations. However, data are missing not at random. Part of missing data is due to the fact that users choose not to rate them. This part of missing data is negative examples of user preferences. Utilizing this information is expected to leverage the performance of recommendation algorithms. Unfortunately, negative examples are mixed with unlabeled positive examples in missing data, and they are hard to be distinguished. In this paper, we propose three schemes to utilize the negative examples in missing data. The schemes are then adapted with SVD++, which is a state-of-the-art matrix factorization recommendation approach, to generate recommendations. Experimental results on two real datasets show that our proposed approaches gain better top-N performance than the baseline ones on both accuracy and diversity.

UR - http://www.scopus.com/inward/record.url?scp=84942284303&partnerID=8YFLogxK

U2 - 10.1155/2015/380472

DO - 10.1155/2015/380472

M3 - Article

AN - SCOPUS:84942284303

SN - 1024-123X

VL - 2015

JO - Mathematical Problems in Engineering

JF - Mathematical Problems in Engineering

M1 - 380472

ER -
