TY - JOUR
T1 - The research and implementation of a large-scale real-time news recommendation algorithm
AU - Zhao, Xiaolin
AU - Zeng, Chonghan
AU - Wu, Chong
AU - Hu, Jingjing
AU - Li, Yue
N1 - Publisher Copyright:
© 2018 Indian Pulp and Paper Technical Association. All Rights Reserved.
PY - 2018/10/1
Y1 - 2018/10/1
N2 - With the rapid development of computer and network technology, the Internet provides a fast way to obtain news. Every day, hundreds of millions of news items are reported around the world. Thus, people are faced with the problem of how to find interesting news quickly. This problem is addressed by personalized news recommendation systems. This paper notes that personalized news recommendation systems face four major challenges: real-time performance, scalability, novelty, and diversity. To address these four challenges, we analyze two existing news recommendation algorithms and their associated advantages and disadvantages. Due to the shortcomings of the existing algorithms, two novel methods are proposed: similar-document-based real-time news recommendation and user-preference-cluster-based real-time news recommendation. The former lacks diversity, and the latter lacks novelty. Then, we propose a novel hybrid method that combines the two algorithms to overcome their shortcomings. The hybrid algorithm addresses all four challenges, and we design several experiments to evaluate whether our algorithm is better than others. Finally, we use large-scale data to test our news recommendation system to verify the feasibility of the proposed algorithm for large-scale data and real sys.
AB - With the rapid development of computer and network technology, the Internet provides a fast way to obtain news. Every day, hundreds of millions of news items are reported around the world. Thus, people are faced with the problem of how to find interesting news quickly. This problem is addressed by personalized news recommendation systems. This paper notes that personalized news recommendation systems face four major challenges: real-time performance, scalability, novelty, and diversity. To address these four challenges, we analyze two existing news recommendation algorithms and their associated advantages and disadvantages. Due to the shortcomings of the existing algorithms, two novel methods are proposed: similar-document-based real-time news recommendation and user-preference-cluster-based real-time news recommendation. The former lacks diversity, and the latter lacks novelty. Then, we propose a novel hybrid method that combines the two algorithms to overcome their shortcomings. The hybrid algorithm addresses all four challenges, and we design several experiments to evaluate whether our algorithm is better than others. Finally, we use large-scale data to test our news recommendation system to verify the feasibility of the proposed algorithm for large-scale data and real sys.
KW - LSH
KW - MinHash
KW - News
KW - Real-time recommendation
KW - Recommender system
KW - Spark
UR - http://www.scopus.com/inward/record.url?scp=85057335537&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85057335537
SN - 0379-5462
VL - 30
SP - 361
EP - 370
JO - IPPTA: Quarterly Journal of Indian Pulp and Paper Technical Association
JF - IPPTA: Quarterly Journal of Indian Pulp and Paper Technical Association
IS - 4
ER -