TY - JOUR
T1 - A method for filtering open relations of geo-entities based on common knowledge bases
AU - Gao, Jialiang
AU - Yu, Li
AU - Qiu, Peiyuan
AU - Lu, Feng
N1 - Publisher Copyright:
© 2019, Science Press. All rights reserved.
PY - 2019/9/25
Y1 - 2019/9/25
N2 - Knowledge graphs (KGs) are crucial resources for geographical knowledge services. Because web text holds a vast amount of geographical knowledge, extracting geo-entity relations from it has become a core technology for constructing geographical KGs, and it directly affects the quality of geographical knowledge services. However, web text inevitably contains noise, and geographical knowledge can be sparsely distributed; both greatly restrict the quality of geo-entity relation extraction. Here, we propose a method for filtering geo-entity relations based on existing knowledge bases (KBs). Specifically, ontology knowledge, fact knowledge, and synonym knowledge are integrated to generate geo-related knowledge. The extracted geo-entity relations and the geo-related knowledge are then converted into vectors, and the maximum similarity between these vectors is taken as the confidence value of an extracted geo-entity relation triple. The method uses existing KBs to assess the quality of geographical information in web text, which helps improve the richness and freshness of geographical KGs. Compared with the Stanford OpenIE method, our method decreased the Mean Square Error (MSE) from 0.62 to 0.06 in the confidence interval [0.7, 1] and improved the area under the Receiver Operating Characteristic (ROC) curve (AUC) from 0.51 to 0.89.
AB - Knowledge graphs (KGs) are crucial resources for geographical knowledge services. Because web text holds a vast amount of geographical knowledge, extracting geo-entity relations from it has become a core technology for constructing geographical KGs, and it directly affects the quality of geographical knowledge services. However, web text inevitably contains noise, and geographical knowledge can be sparsely distributed; both greatly restrict the quality of geo-entity relation extraction. Here, we propose a method for filtering geo-entity relations based on existing knowledge bases (KBs). Specifically, ontology knowledge, fact knowledge, and synonym knowledge are integrated to generate geo-related knowledge. The extracted geo-entity relations and the geo-related knowledge are then converted into vectors, and the maximum similarity between these vectors is taken as the confidence value of an extracted geo-entity relation triple. The method uses existing KBs to assess the quality of geographical information in web text, which helps improve the richness and freshness of geographical KGs. Compared with the Stanford OpenIE method, our method decreased the Mean Square Error (MSE) from 0.62 to 0.06 in the confidence interval [0.7, 1] and improved the area under the Receiver Operating Characteristic (ROC) curve (AUC) from 0.51 to 0.89.
KW - Common knowledge bases
KW - Evaluation of geographic information quality
KW - Geo-KG building
KW - Geo-entity relations extraction
KW - Information filtering
KW - Open relation extraction
KW - Text data
UR - http://www.scopus.com/inward/record.url?scp=85089242706&partnerID=8YFLogxK
U2 - 10.12082/dqxxkx.2019.190005
DO - 10.12082/dqxxkx.2019.190005
M3 - Article
AN - SCOPUS:85089242706
SN - 1560-8999
VL - 21
SP - 1392
EP - 1401
JO - Journal of Geo-Information Science
JF - Journal of Geo-Information Science
IS - 9
ER -