TY - GEN
T1 - Fine-grained product features extraction and categorization in reviews opinion mining
AU - Huang, Sheng
AU - Liu, Xinlan
AU - Peng, Xueping
AU - Niu, Zhendong
PY - 2012
Y1 - 2012
AB - With the growth of user-generated content on the Web, opinion mining of product reviews has become a research area of great value to e-commerce, search, and recommendation. However, popular items can attract hundreds or even thousands of reviews, which makes it laborious for potential buyers and manufacturers to read through them before making a decision. Moreover, the free format and uncertain phrasing of reviews make fine-grained product feature extraction and categorization more difficult than traditional information extraction tasks. In this work, we treat product feature extraction as a sequence labeling task and tackle it with a discriminative learning model, Conditional Random Fields (CRFs). We incorporate part-of-speech features and sentence structure features into the CRF learning process. For product feature categorization, we introduce semantic knowledge-based and distributional context-based similarity measures to calculate the similarity between product feature expressions, and then propose a graph pruning-based categorization algorithm that classifies the collection of feature expressions into semantic groups. Empirical studies demonstrate the effectiveness and efficiency of our approaches compared with counterpart methods.
KW - Conditional random fields
KW - Extraction and categorization
KW - Product features
KW - Similarity calculation
UR - https://www.scopus.com/pages/publications/84873202270
U2 - 10.1109/ICDMW.2012.53
DO - 10.1109/ICDMW.2012.53
M3 - Conference contribution
AN - SCOPUS:84873202270
SN - 9780769549255
T3 - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
SP - 680
EP - 686
BT - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
T2 - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
Y2 - 10 December 2012 through 10 December 2012
ER -