Top-K generation of mediated schemas over multiple data sources

Guohui Ding*, Guoren Wang, Bin Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Citations (Scopus)

Abstract

Schema integration has been widely used in many database applications, such as Data Warehousing, Life Science and Ontology Merging. Though schema integration has been intensively studied in recent years, it remains a challenging issue, because it is almost impossible to find the perfect target schema. This paper proposes an automatic approach to schema integration that explores multiple possible integrated schemas over a set of source schemas from the same domain. Firstly, the concept graph is introduced to represent the source schemas at a higher level of abstraction. Secondly, we divide the similarity between concepts into intervals to generate three merging strategies for schemas. Finally, we design a novel top-k ranking algorithm for the automatic generation of the best candidate mediated schemas. The key component of our algorithm is a pruning technique that uses an ordered buffer and a threshold to filter out candidates. Extensive experimental studies show that our algorithm is effective and runs in polynomial time.
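The abstract only names the ingredients of the pruning step (an ordered buffer and a threshold), so the following is a minimal, generic sketch of that idea rather than the authors' algorithm. It assumes each candidate has a cheap upper bound on its score and a more expensive exact score; the function name `top_k_with_pruning`, the `upper_bound` and `score` callbacks, and the sample data are all hypothetical.

```python
import heapq
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def top_k_with_pruning(
    candidates: Iterable[T],
    k: int,
    upper_bound: Callable[[T], float],
    score: Callable[[T], float],
) -> Tuple[List[Tuple[T, float]], int]:
    """Return the k best candidates and the number of exact scores computed.

    Generic illustration (assumption, not the paper's method): the buffer is
    a min-heap holding at most k entries, and its smallest score acts as the
    pruning threshold once the buffer is full.
    """
    buffer: List[Tuple[float, int, T]] = []  # (score, arrival index, candidate)
    evaluated = 0
    for index, cand in enumerate(candidates):
        # Prune: a candidate whose upper bound cannot beat the current
        # k-th best score is skipped without computing its exact score.
        if len(buffer) == k and upper_bound(cand) <= buffer[0][0]:
            continue
        s = score(cand)
        evaluated += 1
        if len(buffer) < k:
            heapq.heappush(buffer, (s, index, cand))
        elif s > buffer[0][0]:
            heapq.heapreplace(buffer, (s, index, cand))
    # Best score first; ties are broken by arrival order.
    ranked = sorted(buffer, key=lambda entry: (-entry[0], entry[1]))
    return [(cand, s) for s, _, cand in ranked], evaluated


# Hypothetical candidates: (name, upper bound on score, exact score).
demo = [
    ("A", 0.90, 0.80),
    ("B", 0.50, 0.40),
    ("C", 0.95, 0.90),
    ("D", 0.70, 0.60),
    ("E", 0.85, 0.85),
]
top, evaluated = top_k_with_pruning(
    demo, k=2, upper_bound=lambda c: c[1], score=lambda c: c[2]
)
print([c[0] for c, _ in top], evaluated)
```

In this sketch the threshold tightens as better candidates enter the buffer, so later candidates with weak upper bounds are discarded without being scored. How the paper derives its bounds and orders its buffer is not stated in the abstract.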

Original language: English
Title of host publication: Database Systems for Advanced Applications - 15th International Conference, DASFAA 2010, International Workshops
Subtitle of host publication: GDM, BenchmarX, MCIS, SNSMW, DIEW, UDM, Revised Selected Papers
Pages: 143-155
Number of pages: 13
Publication status: Published - 2010
Externally published: Yes
Event: 15th International Conference on Database Systems for Advanced Applications, DASFAA 2010 - Tsukuba, Japan
Duration: 1 Apr 2010 to 4 Apr 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6193 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Conference on Database Systems for Advanced Applications, DASFAA 2010
Country/Territory: Japan
City: Tsukuba
Period: 1/04/10 to 4/04/10
