TY - JOUR
T1 - Efficient opinion summarization on comments with online-LDA
AU - Ma, Jun
AU - Luo, Senlin
AU - Yao, Jianguo
AU - Cheng, Shuxin
AU - Chen, Xi
N1 - Publisher Copyright:
© 2006-2016 by CCC Publications.
PY - 2016
Y1 - 2016
N2 - Customer reviews and comments on web pages are important information in our daily life. For example, we prefer to choose a hotel with positive comments from previous customers. Because the huge volume of such information exhibits the characteristics of big data, it places a heavy burden on the assimilation of customer-contributed opinions. To overcome this problem, we study an efficient opinion summarization approach for a massive set of user reviews and comments associated with an online resource, summarizing the opinions into two categories, i.e., positive and negative. In this paper, we propose a framework that includes: (1) overcoming the big data problem of online comments using the efficient online-LDA approach; (2) selecting meaningful topics from the imbalanced data; (3) summarizing the opinions in the comments with high precision and recall. This framework differs from much of the previous work in that the topics are pre-defined and selected for better opinion summarization. To evaluate the proposed framework, we perform experiments on a dataset of hotel reviews, chosen for the variety of topics it contains. The results show that our framework can gain a significant performance improvement on opinion summarization.
AB - Customer reviews and comments on web pages are important information in our daily life. For example, we prefer to choose a hotel with positive comments from previous customers. Because the huge volume of such information exhibits the characteristics of big data, it places a heavy burden on the assimilation of customer-contributed opinions. To overcome this problem, we study an efficient opinion summarization approach for a massive set of user reviews and comments associated with an online resource, summarizing the opinions into two categories, i.e., positive and negative. In this paper, we propose a framework that includes: (1) overcoming the big data problem of online comments using the efficient online-LDA approach; (2) selecting meaningful topics from the imbalanced data; (3) summarizing the opinions in the comments with high precision and recall. This framework differs from much of the previous work in that the topics are pre-defined and selected for better opinion summarization. To evaluate the proposed framework, we perform experiments on a dataset of hotel reviews, chosen for the variety of topics it contains. The results show that our framework can gain a significant performance improvement on opinion summarization.
KW - Big data
KW - Imbalanced data
KW - Latent Dirichlet allocation (LDA)
KW - Online-LDA
KW - Opinion summarization
UR - http://www.scopus.com/inward/record.url?scp=84962418439&partnerID=8YFLogxK
U2 - 10.15837/ijccc.2016.3.700
DO - 10.15837/ijccc.2016.3.700
M3 - Article
AN - SCOPUS:84962418439
SN - 1841-9836
VL - 11
SP - 414
EP - 427
JO - International Journal of Computers, Communications and Control
JF - International Journal of Computers, Communications and Control
IS - 3
ER -