TY - JOUR
T1 - Capturing semantic hierarchies to perform meaningful integration in HTML tables
AU - Li, Shijun
AU - Liu, Mengchi
AU - Wang, Guoren
AU - Peng, Zhiyong
PY - 2004
Y1 - 2004
N2 - We present a new approach that automatically captures the semantic hierarchies in HTML tables, and semi-automatically integrates HTML tables belonging to a domain. It first automatically captures the attribute-value pairs in HTML tables by normalizing the tables and recognizing their headings. After a global schema is generated manually, it learns the lexical semantic sets and contexts, by which it then eliminates the conflicts and solves the nondeterministic problems in mapping each source schema to the global schema to integrate the data in HTML tables.
AB - We present a new approach that automatically captures the semantic hierarchies in HTML tables, and semi-automatically integrates HTML tables belonging to a domain. It first automatically captures the attribute-value pairs in HTML tables by normalizing the tables and recognizing their headings. After a global schema is generated manually, it learns the lexical semantic sets and contexts, by which it then eliminates the conflicts and solves the nondeterministic problems in mapping each source schema to the global schema to integrate the data in HTML tables.
UR - http://www.scopus.com/inward/record.url?scp=35048830383&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-24655-8_101
DO - 10.1007/978-3-540-24655-8_101
M3 - Article
AN - SCOPUS:35048830383
SN - 0302-9743
VL - 3007
SP - 899
EP - 902
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -