Abstract
This study investigates the use of topic models to identify topics in reviews of healthcare services. To address the semantic sparsity and the lack of co-occurrence information that the latent Dirichlet allocation (LDA) topic model faces in topic extraction from healthcare reviews, a CO-LDA model is proposed that combines word co-occurrence analysis with the LDA topic model. First, word co-occurrence analysis is applied to the review corpus to obtain a word co-occurrence matrix. Second, the LDA topic model is used to represent the reviews in the corpus, and a hierarchical clustering algorithm is then used to group the features. Finally, the healthcare service quality factors that patients focus on are identified. The CO-LDA model is compared with the traditional LDA model on three measures: the average minimum JS distance, the average Kendall correlation coefficient, and the average TF-IDF. The experiments show that the topics identified by the CO-LDA model are more consistent than those identified by the LDA model. A comparison of the experimental results with China's "Hospital Evaluation Standards" shows a high degree of agreement between the two, which supports the effectiveness of the CO-LDA-based method for mining topics from online medical reviews.
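The first step and one of the evaluation ingredients can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy reviews, and the choice to count co-occurrence once per review are assumptions made here for illustration. The abstract does not define the "average minimum JS distance", so only the standard Jensen-Shannon divergence between two distributions is shown.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered word pair, the number of reviews containing both words.

    Assumption: co-occurrence is measured at the review level; the paper
    may use a different window.
    """
    counts = Counter()
    for tokens in docs:
        # Distinct words only, so a pair is counted at most once per review.
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i = 0 contribute nothing to the KL sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical tokenized reviews, invented for this example.
reviews = [
    ["doctor", "patient", "attitude"],
    ["doctor", "waiting", "time"],
    ["doctor", "attitude", "waiting"],
]
pairs = cooccurrence_counts(reviews)
```

Here `pairs[("attitude", "doctor")]` counts the reviews that mention both words, and `js_divergence` could be applied to two topic-word distributions to measure how far apart two topics are.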
Translated title of the contribution | Identifying Topics of Online Healthcare Reviews Based on Improved LDA |
---|---|
Original language | Traditional Chinese |
Pages (from-to) | 427-434 |
Number of pages | 8 |
Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
Volume | 39 |
Issue number | 4 |
DOI | |
Publication status | Published - 1 Apr 2019 |
Keywords
- CO-latent Dirichlet allocation
- Healthcare service
- Semantic sparse
- Topic extraction
- Word co-occurrence analysis