Identifying Topics of Online Healthcare Reviews Based on Improved LDA (基于改进LDA的在线医疗评论主题挖掘)

Hui Ying Gao, Jia Wei Liu, Shu Xin Yang

    Research output: Contribution to journal, Article, peer-reviewed

    17 Citations (Scopus)

    Abstract

    This study examines the use of topic models to identify topics in healthcare service reviews. To address the semantic sparseness and lack of co-occurrence information that arise when the LDA topic model is used to extract topics from healthcare reviews, a CO-LDA model is proposed that combines word co-occurrence analysis with the LDA topic model. First, word co-occurrence analysis is applied to the review corpus to obtain a word co-occurrence matrix. Second, the LDA topic model is used to represent the reviews in the corpus, and a hierarchical clustering algorithm is then used to classify the features. Finally, the healthcare service quality factors that patients focus on are identified. The CO-LDA model is compared with the traditional LDA model on three measures: the average minimum JS distance, the average Kendall correlation coefficient, and the average TF-IDF. The experiments show that the topics identified by the CO-LDA model are more consistent than those identified by the LDA model. A comparison of the experimental results with the "Hospital Evaluation Standards" in China shows a high degree of consistency between the two, which supports the effectiveness of the CO-LDA-based method for mining topics from online medical reviews.
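Two of the building blocks named in the abstract, the word co-occurrence matrix and the JS distance between topics, can be illustrated with a minimal sketch. The abstract does not give the authors' exact definitions, so everything here is an assumption: co-occurrence is counted once per review for each unordered word pair, "JS distance" is taken to be the base-2 Jensen-Shannon divergence, and the function names, sample reviews, and topic distributions are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def cooccurrence_counts(reviews):
    """Count how often each unordered word pair appears in the same review."""
    counts = Counter()
    for tokens in reviews:
        # Use the set of distinct words so a pair counts once per review.
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    # m is the pointwise average of the two distributions.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(xi * log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical tokenized reviews, invented for illustration only.
reviews = [
    ["doctor", "attitude", "patient"],
    ["doctor", "waiting", "time"],
    ["waiting", "time", "long"],
]
pairs = cooccurrence_counts(reviews)

# Hypothetical topic-word distributions over a shared 3-word vocabulary.
topic_a = [0.7, 0.2, 0.1]
topic_b = [0.1, 0.2, 0.7]
divergence = js_divergence(topic_a, topic_b)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (distributions with no shared support), so larger values indicate topics that are easier to tell apart. How the paper averages these values into an "average minimum JS distance" is not stated in the abstract and is not reproduced here.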

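    Translated title of the contribution: Identifying Topics of Online Healthcare Reviews Based on Improved LDA
    Original language: Chinese (Traditional)
    Pages (from-to): 427-434
    Number of pages: 8
    Journal: Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
    Volume: 39
    Issue number: 4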
    Publication status: Published - 1 Apr 2019

    Keywords

    • CO-latent Dirichlet allocation
    • Healthcare service
    • Semantic sparse
    • Topic extraction
    • Word co-occurrence analysis
