
CardOOD: robust query-driven cardinality estimation under out-of-distribution

  • Chinese University of Hong Kong

Research output: Contribution to journal › Article › peer-review

Abstract

Query-driven learned estimators are accurate, flexible, and lightweight alternatives to traditional estimators in query optimization. However, existing query-driven approaches struggle with the problem of Out-of-Distribution (OOD), where the test workload distribution differs from the training workload, leading to significant performance degradation. In this paper, we present CardOOD, a modular learning framework designed to construct robust query-driven cardinality estimators that are resilient against the OOD problem. Our framework focuses on offline training algorithms that develop one-off models from a static workload, suitable for model initialization and periodic retraining. In CardOOD, we systematically adapt prevailing transfer learning and robust learning techniques, falling into three categories: representation learning, data manipulation, and new learning strategies, and instantiate them for training cardinality estimators. Beyond transferring existing techniques, we propose a novel learning algorithm, OrderEmb, tailored to the specific properties of cardinality estimation. This algorithm, lying in the category of learning strategy, exploits the partial-order constraint on query cardinalities induced by predicate containment. We provide a theoretical analysis of OrderEmb, justifying its ability to enhance representation quality by maximizing mutual information. Comprehensive experimental studies demonstrate the efficacy of the algorithms of CardOOD in mitigating the OOD problem to varying extents. We further integrate CardOOD into PostgreSQL, showcasing its practical utility in end-to-end query optimization.
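The partial-order constraint mentioned in the abstract can be illustrated with a small sketch. This is not the OrderEmb algorithm itself, which the abstract does not specify. It only shows the underlying property: when one query's predicates are contained in another's, its cardinality cannot be larger. The predicate encoding (closed ranges per column), the function names `contains` and `order_violation`, and the hinge-style penalty are all assumptions made for illustration.

```python
# Illustrative only: range predicates as {column: (low, high)}, closed intervals.
# The encoding, names, and hinge penalty are assumptions, not the paper's OrderEmb.

def contains(outer, inner):
    """Sufficient check that every row matching `inner` also matches `outer`.

    A column missing from a query is treated as unconstrained.
    """
    for col, (lo, hi) in outer.items():
        if col not in inner:
            return False  # inner is unconstrained on col, outer is not
        ilo, ihi = inner[col]
        if ilo < lo or ihi > hi:
            return False  # inner range sticks out of outer range
    return True


def order_violation(pred_outer, pred_inner):
    """Hinge penalty: positive only when the contained query is estimated larger."""
    return max(0.0, pred_inner - pred_outer)


q_outer = {"age": (20, 60)}
q_inner = {"age": (30, 40), "salary": (0, 5000)}

assert contains(q_outer, q_inner)      # q_inner is the more selective query
assert not contains(q_inner, q_outer)

print(order_violation(1000.0, 1200.0))  # 200.0: estimates break the ordering
print(order_violation(1000.0, 800.0))   # 0.0: estimates respect the ordering
```

A penalty of this kind could, in principle, be added to a training loss over pairs of queries related by containment. How CardOOD actually uses the constraint is described in the paper, not here.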

Original language: English
Article number: 28
Journal: VLDB Journal
Volume: 35
Issue number: 4
DOIs
Publication status: Published - Jul 2026

Keywords

  • Cardinality estimation
  • Out-of-distribution
  • Query optimization
