An Effective Framework for Enhancing Query Answering in a Heterogeneous Data Lake

Qin Yuan, Ye Yuan*, Zhenyu Wen, He Wang, Shiyuan Tang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

2 Citations (Scopus)

Abstract

In recent years, there has been growing interest in cross-source searching to gain rich knowledge. A data lake collects massive amounts of raw, heterogeneous data with different data schemas and query interfaces. Many real-life applications, such as e-commerce, bioinformatics, and healthcare, require query answering over a heterogeneous data lake. In this paper, we propose LakeAns, which semantically integrates the heterogeneous data schemas of the lake to enhance the semantics of query answers. To this end, we propose a novel framework that performs cross-source searching efficiently and effectively. The framework uses a reinforcement learning method to semantically integrate the data schemas and then creates a global relational schema for the heterogeneous data. It then runs a query answering algorithm over the global schema to find answers across multiple data sources. Extensive experimental evaluations on real-life data verify that our approach outperforms existing solutions in both effectiveness and efficiency.

Original language: English
Title of host publication: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 770-780
Number of pages: 11
ISBN (Electronic): 9781450394086
Publication status: Published - 19 Jul 2023
Event: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023 - Taipei, Taiwan, China
Duration: 23 Jul 2023 to 27 Jul 2023

Publication series

Name: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023
Country/Territory: Taiwan, China
City: Taipei
Period: 23/07/23 to 27/07/23
