A lexicon-based multi-class semantic orientation analysis for microblogs

Yuqing Li; Xin Li; Fan Li; Xiaofeng Zhang

doi:10.1007/978-3-319-11116-2_8

Yuqing Li, Xin Li^*, Fan Li, Xiaofeng Zhang

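^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Citations (Scopus)

Abstract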

Most existing work on semantic orientation analysis focuses on distinguishing two polarities (positive and negative). In this paper, we propose a lexicon-based multi-class semantic orientation analysis for microblogs. To better capture social attention to public events, we add Concern to the conventional psychological classes of sentiment and build a sentiment lexicon with five categories (Concern, Joy, Blue, Anger, Fear). The seed words of the lexicon are extracted from HowNet, NTUSD, and catchwords in Sina Weibo posts. HowNet semantic similarity is then used to detect additional sentiment words and enrich the lexicon. Accordingly, each Weibo post is represented as a multi-dimensional numerical vector in feature space. We then classify the posts with a Semi-Supervised Gaussian Mixture Model (Semi-GMM) and an adaptive K-nearest neighbour (KNN) method, using symmetric Kullback-Leibler divergence (KL-divergence) as the similarity measure. We compare the proposed methods with several competitive baselines, e.g., majority vote, KNN with cosine similarity, and SVM. The experimental evaluation shows that the proposed methods outperform the other approaches by a large margin in terms of accuracy and F1 score.
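
The pipeline in the abstract (lexicon hits turned into a per-category vector, then nearest-neighbour classification under symmetric KL-divergence) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy English lexicon, the add-one smoothing, the fixed `k`, and the function names `post_to_vector`, `symmetric_kl`, and `knn_predict` are all assumptions made for illustration. The paper's lexicon is built from HowNet, NTUSD, and Weibo catchwords, and its KNN is adaptive, neither of which is reproduced here.

```python
import math
from collections import Counter

# The five sentiment categories named in the abstract.
CATEGORIES = ["Concern", "Joy", "Blue", "Anger", "Fear"]

# Hypothetical toy lexicon (word -> category), for illustration only.
TOY_LEXICON = {
    "worried": "Concern", "attention": "Concern",
    "happy": "Joy", "great": "Joy",
    "sad": "Blue", "lonely": "Blue",
    "furious": "Anger", "hate": "Anger",
    "scared": "Fear", "panic": "Fear",
}


def post_to_vector(tokens, lexicon=TOY_LEXICON, smoothing=1.0):
    """Count lexicon hits per category and normalize to a distribution.

    Smoothing (an assumption of this sketch) keeps every entry strictly
    positive so that the KL-divergence below is always defined.
    """
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    raw = [counts[c] + smoothing for c in CATEGORIES]
    total = sum(raw)
    return [x / total for x in raw]


def symmetric_kl(p, q):
    """Return KL(p || q) + KL(q || p) for strictly positive distributions.

    The two sums combine into sum((p_i - q_i) * log(p_i / q_i)).
    """
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def knn_predict(query, labeled, k=3):
    """Majority vote among the k labeled vectors closest to `query`.

    `labeled` is a list of (vector, label) pairs. A fixed k is used here,
    which is a simplification of the adaptive KNN the abstract mentions.
    """
    nearest = sorted(labeled, key=lambda item: symmetric_kl(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [
        (post_to_vector(["happy", "great"]), "Joy"),
        (post_to_vector(["great", "happy", "happy"]), "Joy"),
        (post_to_vector(["scared", "panic"]), "Fear"),
        (post_to_vector(["panic", "panic", "worried"]), "Fear"),
    ]
    print(knn_predict(post_to_vector(["so", "happy", "today"]), train, k=3))
```

Unlike plain KL-divergence, the symmetric form gives the same value whichever post is treated as the reference, which makes it usable as a neighbour-ranking score.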

Original language	English
Title of host publication	Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings
Publisher	Springer Verlag
Pages	81-92
Number of pages	12
ISBN (Print)	9783319111155
DOI	https://doi.org/10.1007/978-3-319-11116-2_8
Publication status	Published - 2014
Event	16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014 - Changsha, China. Duration: 5 Sep 2014 → 7 Sep 2014

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	8709 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014
Country/Territory	China
City	Changsha
Period	5/09/14 → 7/09/14

Cite this

Li, Y., Li, X., Li, F., & Zhang, X. (2014). A lexicon-based multi-class semantic orientation analysis for microblogs. In Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings (pp. 81-92). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8709 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-11116-2_8

Li, Yuqing ; Li, Xin ; Li, Fan et al. / A lexicon-based multi-class semantic orientation analysis for microblogs. Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings. Springer Verlag, 2014. pp. 81-92 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{5d63c2485cb24a739becc922ea90fbd7,

title = "A lexicon-based multi-class semantic orientation analysis for microblogs",

abstract = "Most existing work on semantic orientation analysis focuses on distinguishing two polarities (positive and negative). In this paper, we propose a lexicon-based multi-class semantic orientation analysis for microblogs. To better capture social attention to public events, we add Concern to the conventional psychological classes of sentiment and build a sentiment lexicon with five categories (Concern, Joy, Blue, Anger, Fear). The seed words of the lexicon are extracted from HowNet, NTUSD, and catchwords in Sina Weibo posts. HowNet semantic similarity is then used to detect additional sentiment words and enrich the lexicon. Accordingly, each Weibo post is represented as a multi-dimensional numerical vector in feature space. We then classify the posts with a Semi-Supervised Gaussian Mixture Model (Semi-GMM) and an adaptive K-nearest neighbour (KNN) method, using symmetric Kullback-Leibler divergence (KL-divergence) as the similarity measure. We compare the proposed methods with several competitive baselines, e.g., majority vote, KNN with cosine similarity, and SVM. The experimental evaluation shows that the proposed methods outperform the other approaches by a large margin in terms of accuracy and F1 score.",

keywords = "Kullback-Leibler divergence, Semantic Orientation Analysis, Semi-supervised Gaussian mixture model (Semi-GMM)",

author = "Yuqing Li and Xin Li and Fan Li and Xiaofeng Zhang",

year = "2014",

doi = "10.1007/978-3-319-11116-2_8",

language = "English",

isbn = "9783319111155",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "81--92",

booktitle = "Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings",

address = "Germany",

note = "16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014 ; Conference date: 05-09-2014 Through 07-09-2014",

}

Li, Y, Li, X, Li, F & Zhang, X 2014, A lexicon-based multi-class semantic orientation analysis for microblogs. in Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8709 LNCS, Springer Verlag, pp. 81-92, 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014, Changsha, China, 5/09/14. https://doi.org/10.1007/978-3-319-11116-2_8

A lexicon-based multi-class semantic orientation analysis for microblogs. / Li, Yuqing; Li, Xin; Li, Fan et al.
Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings. Springer Verlag, 2014. pp. 81-92 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8709 LNCS).

TY - GEN

T1 - A lexicon-based multi-class semantic orientation analysis for microblogs

AU - Li, Yuqing

AU - Li, Xin

AU - Li, Fan

AU - Zhang, Xiaofeng

PY - 2014

Y1 - 2014

AB - Most existing work on semantic orientation analysis focuses on distinguishing two polarities (positive and negative). In this paper, we propose a lexicon-based multi-class semantic orientation analysis for microblogs. To better capture social attention to public events, we add Concern to the conventional psychological classes of sentiment and build a sentiment lexicon with five categories (Concern, Joy, Blue, Anger, Fear). The seed words of the lexicon are extracted from HowNet, NTUSD, and catchwords in Sina Weibo posts. HowNet semantic similarity is then used to detect additional sentiment words and enrich the lexicon. Accordingly, each Weibo post is represented as a multi-dimensional numerical vector in feature space. We then classify the posts with a Semi-Supervised Gaussian Mixture Model (Semi-GMM) and an adaptive K-nearest neighbour (KNN) method, using symmetric Kullback-Leibler divergence (KL-divergence) as the similarity measure. We compare the proposed methods with several competitive baselines, e.g., majority vote, KNN with cosine similarity, and SVM. The experimental evaluation shows that the proposed methods outperform the other approaches by a large margin in terms of accuracy and F1 score.

KW - Kullback-Leibler divergence

KW - Semantic Orientation Analysis

KW - Semi-supervised Gaussian mixture model (Semi-GMM)

UR - http://www.scopus.com/inward/record.url?scp=84958545721&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-11116-2_8

DO - 10.1007/978-3-319-11116-2_8

M3 - Conference contribution

AN - SCOPUS:84958545721

SN - 9783319111155

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 81

EP - 92

BT - Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings

PB - Springer Verlag

T2 - 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014

Y2 - 5 September 2014 through 7 September 2014

ER -

Li Y, Li X, Li F, Zhang X. A lexicon-based multi-class semantic orientation analysis for microblogs. In Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings. Springer Verlag. 2014. p. 81-92. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-319-11116-2_8
