RW-Tree: A Learned Workload-aware Framework for R-tree Construction

Haowen Dong, Chengliang Chai*, Yuyu Luo, Jiabin Liu, Jianhua Feng, Chaoqun Zhan

*此作品的通讯作者

科研成果: 书/报告/会议事项章节会议稿件同行评审

10 引用 (Scopus)

摘要

R-tree is a popular index which supports efficient queries on multi-dimensional data. The performance of R-tree mostly depends on how the tree structure is built if new data instances are inserted, which has been studied for years. Existing works can be categorized into two groups. One is the bulk-loading approaches that insert data instances in batch, but they cannot support real-time insertion. Hence, our focus is on the other one that inserts each data instance individually, and thus fresh data can be instantly queried. However, existing methods do not consider the workload information, which leads to limited potential optimization opportunity. Therefore, it is important to study workload-aware R-tree construction for efficient multi-dimensional data access. There are several challenges. First, how to represent the query workload is a challenge. Second, given a workload, it is challenging to accurately measure the benefit of a data insertion choice. Third, both range queries and kNN queries should be considered in the workload. To address these challenges, we propose a novel framework that leverages a learning-based method to solve the workload-aware R-tree construction problem. First, by extracting the query workload features, we learn a distribution for the workload using the space partition. Second, considering the distribution, we design a cost model to describe the benefits (i.e., query execution time) of different insertion choices and select the best one. Third, we convert the kNN queries to range search ones, so as to support the workload including both types of queries. Experimental results show that on OpenStreetMap real datasets, compared with baselines, we improve the query efficiency by 1.17x.

源语言英语
主期刊名Proceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022
出版商IEEE Computer Society
2073-2085
页数13
ISBN(电子版)9781665408837
DOI
出版状态已出版 - 2022
已对外发布
活动38th IEEE International Conference on Data Engineering, ICDE 2022 - Virtual, Online, 马来西亚
期限: 9 5月 202212 5月 2022

出版系列

姓名Proceedings - International Conference on Data Engineering
2022-May
ISSN(印刷版)1084-4627

会议

会议38th IEEE International Conference on Data Engineering, ICDE 2022
国家/地区马来西亚
Virtual, Online
时期9/05/2212/05/22

指纹

探究 'RW-Tree: A Learned Workload-aware Framework for R-tree Construction' 的科研主题。它们共同构成独一无二的指纹。

引用此