TY - GEN
T1 - AutoCE
T2 - 39th IEEE International Conference on Data Engineering, ICDE 2023
AU - Zhang, Jintao
AU - Zhang, Chao
AU - Li, Guoliang
AU - Chai, Chengliang
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Cardinality estimation (CE) plays a crucial role in many database-related tasks such as query generation, cost estimation, and join ordering. Lately, we have witnessed the emergence of numerous learned CE models. However, no single CE model is invincible when it comes to the datasets with various data distributions. To facilitate data-intensive applications with accurate and efficient cardinality estimation, it is important to have an approach that can judiciously and efficiently select the most suitable CE model for an arbitrary dataset.In this paper, we study a new problem of selecting the best CE models for a variety of datasets. This problem is rather challenging as it is hard to capture the relationship from various datasets to the performance of disparate models. To address this problem, we propose a model advisor, named AutoCE, which can adaptively select the best model for a dataset. The main contribution of AutoCE is the learning-based model selection, where deep metric learning is used to learn a recommendation model and incremental learning is proposed to reduce the labeling overhead and improve the model robustness. We have integrated AutoCE into PostgreSQL and evaluated its impact on query optimization. The results showed that AutoCE achieved the best performance (27% better) and outperformed the baselines concerning accuracy (2.1x better) and efficacy (4.2x better).
AB - Cardinality estimation (CE) plays a crucial role in many database-related tasks such as query generation, cost estimation, and join ordering. Lately, we have witnessed the emergence of numerous learned CE models. However, no single CE model is invincible when it comes to the datasets with various data distributions. To facilitate data-intensive applications with accurate and efficient cardinality estimation, it is important to have an approach that can judiciously and efficiently select the most suitable CE model for an arbitrary dataset.In this paper, we study a new problem of selecting the best CE models for a variety of datasets. This problem is rather challenging as it is hard to capture the relationship from various datasets to the performance of disparate models. To address this problem, we propose a model advisor, named AutoCE, which can adaptively select the best model for a dataset. The main contribution of AutoCE is the learning-based model selection, where deep metric learning is used to learn a recommendation model and incremental learning is proposed to reduce the labeling overhead and improve the model robustness. We have integrated AutoCE into PostgreSQL and evaluated its impact on query optimization. The results showed that AutoCE achieved the best performance (27% better) and outperformed the baselines concerning accuracy (2.1x better) and efficacy (4.2x better).
UR - http://www.scopus.com/inward/record.url?scp=85167659577&partnerID=8YFLogxK
U2 - 10.1109/ICDE55515.2023.00201
DO - 10.1109/ICDE55515.2023.00201
M3 - Conference contribution
AN - SCOPUS:85167659577
T3 - Proceedings - International Conference on Data Engineering
SP - 2621
EP - 2633
BT - Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
PB - IEEE Computer Society
Y2 - 3 April 2023 through 7 April 2023
ER -