TY - GEN
T1 - Automatic multi-schema integration based on user preference
AU - Ding, Guohui
AU - Wang, Guoren
AU - Xin, Junchang
AU - Geng, Huichao
PY - 2010
Y1 - 2010
AB - Schema integration plays a central role in numerous database applications, such as Deep Web, DataSpaces and Ontology Merging. Although there has been much research on schema integration, existing work neglects user preference, which is a very important factor for improving the quality of mediated schemas. In this paper, we propose automatic multi-schema integration based on user preference. A new concept named reference schema is introduced to represent user preference. This concept guides the integration process to generate mediated schemas according to user preference. Unlike previous solutions, our approach employs F-measure and "attribute density" to measure the similarity between schemas. Based on this similarity, we design a top-k ranking algorithm that retrieves the k mediated schemas that users really expect. The key component of the algorithm is a pruning strategy that uses divide and conquer to narrow down the search space of candidate schemas. Finally, the experimental study demonstrates the effectiveness and good performance of our approach.
UR - http://www.scopus.com/inward/record.url?scp=77955016743&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-14246-8_67
DO - 10.1007/978-3-642-14246-8_67
M3 - Conference contribution
AN - SCOPUS:77955016743
SN - 3642142451
SN - 9783642142451
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 704
EP - 716
BT - Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings
T2 - 11th International Conference on Web-Age Information Management, WAIM 2010
Y2 - 15 July 2010 through 17 July 2010
ER -