Abstract
This paper describes a technique for automatically acquiring knowledge from Web text, based on pseudo-natural language understanding. Web page texts are first represented by domain grammars. The domain grammars are then transformed into rules that describe sentence information and conform to regular expression conventions. Using these rules, the Web page texts are transformed into semantic triples that represent Web knowledge, and the triples together form the domain knowledge base. Test data show that, across different kinds of Web page data, the domain knowledge base built with this technique achieves an average recall of 71.5% and an average precision of 79.1%.
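The rule-to-triple step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` type, the two sample patterns, the predicate names, and the sample sentences are all hypothetical, chosen only to show how a regular-expression rule might map a sentence to a (subject, predicate, object) triple.

```python
import re
from typing import List, NamedTuple, Tuple

Triple = Tuple[str, str, str]


class Rule(NamedTuple):
    """A hypothetical extraction rule: a regex plus the predicate it yields."""
    pattern: re.Pattern
    predicate: str


# Hypothetical rules standing in for those derived from a domain grammar.
# Each pattern must define the named groups "subject" and "object".
RULES = [
    Rule(re.compile(r"(?P<subject>[A-Z]\w*) is located in (?P<object>[A-Z]\w*)"),
         "locatedIn"),
    Rule(re.compile(r"(?P<subject>[A-Z]\w*) was founded in (?P<object>\d{4})"),
         "foundedIn"),
]


def extract_triples(text: str, rules: List[Rule] = RULES) -> List[Triple]:
    """Apply every rule to every sentence and collect the matched triples."""
    triples: List[Triple] = []
    # Naive sentence split: break on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for rule in rules:
            for match in rule.pattern.finditer(sentence):
                triples.append(
                    (match.group("subject"), rule.predicate, match.group("object"))
                )
    return triples


if __name__ == "__main__":
    sample = "Acme is located in Springfield. Acme was founded in 1999."
    for triple in extract_triples(sample):
        print(triple)
```

In this sketch the knowledge base is simply the list of triples returned; a real system would presumably also deduplicate and store them.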
Original language | English |
---|---|
Pages (from-to) | 1065-1068 |
Number of pages | 4 |
Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
Volume | 26 |
Issue | 12 |
Publication status | Published - Dec 2006 |