Technology of web page knowledge acquisition

Si Kang Hu*, Yuan Da Cao

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

A technology for automatic knowledge acquisition from Web text, based on pseudo-natural language understanding, is described. Web page texts are first represented by domain grammars. The domain grammars are then transformed into rules that describe sentence information and conform to regular expression conventions. Using these rules, the Web page texts are transformed into semantic triples that represent Web knowledge, and the triples form the domain knowledge base. Test data showed that, across different kinds of Web page data, the domain knowledge base built with this technology achieved an average recall of 71.5% and an average precision of 79.1%.
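The pipeline in the abstract (rules expressed as regular expressions, applied to sentences to produce semantic triples, then scored by recall and precision) can be illustrated with a minimal sketch. This is not the authors' implementation: the two rules, the predicate names `located_in` and `founded_in`, the sample sentences, and the gold-standard set are all hypothetical stand-ins, since the abstract does not give the actual domain grammars or test data.

```python
import re

# Hypothetical rules: each compiled regex stands in for a rule that would be
# derived from a domain grammar. The abstract does not list the real rules.
RULES = [
    (re.compile(r"(?P<subject>[A-Z][\w ]*?) is located in (?P<object>[A-Z][\w ]*)"),
     "located_in"),
    (re.compile(r"(?P<subject>[A-Z][\w ]*?) was founded in (?P<object>\d{4})"),
     "founded_in"),
]


def extract_triples(text):
    """Apply every rule to every sentence; return (subject, predicate, object) triples."""
    triples = set()
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for pattern, predicate in RULES:
            for match in pattern.finditer(sentence):
                triples.add((
                    match.group("subject").strip(),
                    predicate,
                    match.group("object").strip(),
                ))
    return triples


def precision_recall(extracted, gold):
    """Precision = correct / extracted; recall = correct / gold."""
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy page text and toy gold standard (both invented for illustration).
page_text = (
    "Acme Library is located in Springfield. "
    "Acme Library was founded in 1950."
)
gold_triples = {
    ("Acme Library", "located_in", "Springfield"),
    ("Acme Library", "founded_in", "1950"),
    ("Acme Library", "has_director", "Jane Doe"),  # no rule covers this one
}

knowledge_base = extract_triples(page_text)
precision, recall = precision_recall(knowledge_base, gold_triples)
```

In this toy setup every extracted triple is correct, but one gold triple has no matching rule, so recall falls below precision. The same pattern appears in the reported averages (71.5% recall against 79.1% precision), although the abstract does not say what caused the gap in the authors' tests.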

Original language: English
Pages (from-to): 1065-1068
Number of pages: 4
Journal: Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
Volume: 26
Issue number: 12
Publication status: Published - Dec 2006
