Adverse drug reaction detection on social media with deep linguistic features

Ying Zhang*, Shaoze Cui, Huiying Gao

*此作品的通讯作者

科研成果: 期刊稿件文章同行评审

31 引用 (Scopus)

摘要

Adverse reactions caused by drugs are one of the most important public health problems. Social media has encouraged more patients to share their drug use experiences and has become a major source for the detection of professionally unreported adverse drug reactions (ADRs). Since a large number of user posts do not mention any ADR, accurate detection of the presence of ADRs in each user post is necessary before further research can be conducted. Previous feature-based methods focus on extracting more shallow linguistic features that are unable to capture deep and subtle information in the context, ultimately failing to provide satisfactory accuracy. To overcome the limitations of previous studies, this paper proposes a novel method that can extract deep linguistic features and then combine them with shallow linguistic features for ADR detection. We first extract predicate-ADR pairs under the guidance of extended syntactic dependencies and ADR lexicon. Then, we extract semantic and part-of-speech (POS) features for each pair and pool the features of different pairs to generate a holistic representation of deep linguistic features. Finally, we use the collection of deep features and several shallow features to train the predictive models. A series of experiments are performed on data sets collected from DailyStrength and Twitter. Our approach can achieve AUCs of 94.44% and 88.97% on the two data sets, respectively, outperforming other state-of-the-art methods. The results demonstrate the potential benefits of deep linguistic features for ADR detection on social data. This method can be applied to multiple other healthcare and text analysis tasks and can be used to support pharmacovigilance research.

源语言英语
文章编号103437
期刊Journal of Biomedical Informatics
106
DOI
出版状态已出版 - 6月 2020

指纹

探究 'Adverse drug reaction detection on social media with deep linguistic features' 的科研主题。它们共同构成独一无二的指纹。

引用此