IEKM-MD: An intelligent platform for information extraction and knowledge mining in multi-domains

Yu Li, Tao Yue, Wu Zhenxin

Research output: Contribution to journal, conference article, peer-reviewed

1 citation (Scopus)

Abstract

Terminology varies greatly across disciplines, and annotated corpora are scarce, which limits the portability of information extraction models. The content of scientific articles remains underutilized. This paper presents IEKM-MD, an intelligent platform for information extraction and knowledge mining. Two techniques are proposed. First, a phrase-level scientific entity extraction model combines a neural network with active learning, which reduces the model's dependence on large-scale corpora. Second, a translation-based relation prediction model improves the relation embeddings by optimizing the loss function. In addition, the platform integrates an advanced entity recognition model (spaCy.NER) and a keyword extraction model (RAKE). It provides a range of services for fine-grained, multi-dimensional knowledge, including problem discovery, method recognition, relation representation and hot spot detection. Experiments were carried out in three domains: Artificial Intelligence, Nanotechnology and Genetic Engineering. The average accuracies of scientific entity extraction are 0.91, 0.52 and 0.76, respectively.
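As a rough illustration of what "translation-based" relation prediction usually means, the sketch below scores a triple (head, relation, tail) by the distance between head + relation and tail, and applies a standard margin ranking loss. This is a generic TransE-style formulation, not the loss proposed in the paper, whose details the abstract does not give. The function names, the toy two-dimensional embeddings and the margin value are all assumptions made for illustration.

```python
import math

def translation_score(head, relation, tail):
    """L2 distance ||head + relation - tail||; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that pushes a true triple at least `margin` closer than a corrupted one."""
    return max(0.0, margin + pos_score - neg_score)

# Hypothetical toy embeddings (not taken from the paper).
method = [0.9, 0.1]      # head entity, e.g. a method
solves = [0.1, 0.9]      # relation vector
problem = [1.0, 1.0]     # true tail entity, e.g. a problem
unrelated = [-1.0, 0.0]  # corrupted tail used as a negative sample

pos = translation_score(method, solves, problem)
neg = translation_score(method, solves, unrelated)
loss = margin_ranking_loss(pos, neg)
```

In this toy setup the true triple scores lower (closer) than the corrupted one by more than the margin, so the loss is zero and training would leave these embeddings unchanged.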

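Original language: English
Pages (from-to): 73-78
Number of pages: 6
Journal: CEUR Workshop Proceedings
Volume: 2658
Publication status: Published - 2020
Externally published
Event: 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, EEKE 2020 - Virtual, Online, China
Duration: 1 Aug 2020 → …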
