Lightweight and Accurate Cardinality Estimation by Neural Network Gaussian Process

Kangfei Zhao, Jeffrey Xu Yu, Zongyan He, Rui Li, Hao Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 Citations (Scopus)

Abstract

Deep Learning (DL) has achieved great success in many real applications. Despite this success, several major problems arise when deploying advanced DL models in database systems, such as hyper-parameter tuning, the risk of overfitting, and the lack of prediction uncertainty. In this paper, we study lightweight and accurate cardinality estimation for SQL queries that is also uncertainty-aware. By lightweight, we mean that we can train a DL model in a few seconds. With uncertainty available, it becomes possible to update the estimator to improve its predictions in areas with high uncertainty. The approach we explore differs from the direction of deploying sophisticated DL models as cardinality estimators in database systems. We employ Bayesian deep learning (BDL), which serves as a bridge between Bayesian inference and deep learning. The predictive distribution given by BDL provides principled uncertainty calibration for the prediction. In addition, when the network width of a BDL model goes to infinity, the model becomes equivalent to a Gaussian Process (GP). This special class of BDL, known as Neural Network Gaussian Process (NNGP), inherits the advantages of the Bayesian approach while keeping the universal approximation property of neural networks, and, as a nonparametric model, can utilize a much larger model space to model distribution-free data. In extensive performance studies comparing against existing learned estimators, we show that our NNGP estimator achieves high accuracy, is fast to build, and is robust to query workload shift. We also confirm the effectiveness of NNGP by integrating it into PostgreSQL.
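
To make the NNGP idea in the abstract concrete, here is a minimal sketch of GP regression with the closed-form kernel of an infinitely wide ReLU network, returning both a predictive mean and a predictive variance. It is not the authors' implementation: the function names, the hyper-parameters (`sigma_w`, `sigma_b`, `noise`), and the synthetic "query feature" data are all assumptions made for illustration, and the abstract does not specify how queries are featurized.

```python
import numpy as np

def nngp_relu_kernel(X1, X2, depth=1, sigma_w=1.0, sigma_b=0.1):
    """Kernel of an infinitely wide ReLU network with `depth` hidden layers.

    Hypothetical helper: uses the standard arc-cosine recursion for ReLU.
    """
    d = X1.shape[1]
    # Covariances of the first-layer pre-activations.
    k12 = sigma_b**2 + sigma_w**2 * (X1 @ X2.T) / d
    k11 = sigma_b**2 + sigma_w**2 * np.sum(X1 * X1, axis=1) / d
    k22 = sigma_b**2 + sigma_w**2 * np.sum(X2 * X2, axis=1) / d
    for _ in range(depth):
        norm = np.sqrt(np.outer(k11, k22))
        cos_t = np.clip(k12 / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v).
        k12 = sigma_b**2 + sigma_w**2 / (2 * np.pi) * norm * (
            np.sin(theta) + (np.pi - theta) * cos_t
        )
        # On the diagonal theta = 0, so the expectation reduces to k / 2.
        k11 = sigma_b**2 + sigma_w**2 * k11 / 2
        k22 = sigma_b**2 + sigma_w**2 * k22 / 2
    return k12

def gp_predict(X_train, y_train, X_test, noise=1e-3, **kernel_args):
    """Exact GP posterior mean and variance under the kernel above."""
    K = nngp_relu_kernel(X_train, X_train, **kernel_args)
    K = K + noise * np.eye(len(X_train))
    K_s = nngp_relu_kernel(X_test, X_train, **kernel_args)
    K_ss_diag = np.diag(nngp_relu_kernel(X_test, X_test, **kernel_args))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = K_ss_diag - np.sum(v * v, axis=0)
    return mean, np.maximum(var, 0.0)

# Synthetic stand-in data: 3 made-up query features, log-scale "cardinality".
rng = np.random.default_rng(0)
X_train = rng.random((50, 3))
y_train = np.log1p(1000 * X_train.prod(axis=1))

# One test point seen in training, one far outside the training region.
X_test = np.vstack([X_train[:1], -3.0 * np.ones((1, 3))])
mean, var = gp_predict(X_train, y_train, X_test)
print("predictive mean:", mean)
print("predictive std: ", np.sqrt(var))
```

There is no gradient-based training here: fitting is a single Cholesky factorization, which is one plausible reading of "lightweight". The second test point lies far from the training data and gets a much larger predictive variance, which is the kind of signal that could be used to decide where an estimator needs more training queries.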

Original language: English
Title of host publication: SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 973-987
Number of pages: 15
ISBN (Electronic): 9781450392495
Publication status: Published - 10 Jun 2022
Externally published: Yes
Event: 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 - Virtual, Online, United States
Duration: 12 Jun 2022 → 17 Jun 2022

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
Country/Territory: United States
City: Virtual, Online
Period: 12/06/22 → 17/06/22

Keywords

  • Gaussian process
  • cardinality estimation
  • machine learning
