A Learned Query Rewrite System using Monte Carlo Tree Search

Xuanhe Zhou; Guoliang Li; Chengliang Chai; Jianhua Feng

doi:10.14778/3485450.3485456

Xuanhe Zhou, Guoliang Li*, Chengliang Chai, Jianhua Feng

*Corresponding author for this work

Tsinghua University

Research output: Contribution to journal › Conference article › peer-review

36 Citations (Scopus)

Abstract

Query rewrite transforms a SQL query into an equivalent one but with higher performance. However, SQL rewrite is an NP-hard problem, and existing approaches adopt heuristics to rewrite the queries. These heuristics have two main limitations. First, the order of applying different rewrite rules significantly affects the query performance. However, the search space of all possible rewrite orders grows exponentially with the number of query operators and rules and it is rather hard to find the optimal rewrite order. Existing methods apply a pre-defined order to rewrite queries and will fall in a local optimum. Second, different rewrite rules have different benefits for different queries. Existing methods work on single plans but cannot effectively estimate the benefits of rewriting a query. To address these challenges, we propose a policy tree based query rewrite framework, where the root is the input query and each node is a rewritten query from its parent. We aim to explore the tree nodes in the policy tree to find the optimal rewrite query. We propose to use Monte Carlo Tree Search to explore the policy tree, which navigates the policy tree to efficiently get the optimal node. Moreover, we propose a learning-based model to estimate the expected performance improvement of each rewritten query, which guides the tree search more accurately. We also propose a parallel algorithm that can explore the tree search in parallel in order to improve the performance. Experimental results showed that our method significantly outperformed existing approaches.
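The policy-tree search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three rewrite rules, the per-operator costs, and the `estimated_benefit` function (a stand-in for the learned estimation model) are all hypothetical, and the parallel search is omitted. The root node is the input query, each child applies one applicable rule to its parent, and a UCT-style score decides which node to explore next.

```python
import math

# Toy per-operator costs (made up for illustration, not from the paper).
OP_COST = {"scan": 10, "subquery": 50, "join": 20,
           "filter": 10, "pushed_filter": 2, "redundant_filter": 5}

def cost(query):
    return sum(OP_COST[op] for op in query)

# Hypothetical rewrite rules: each returns the rewritten query,
# or None when the rule does not apply.
def unnest_subquery(q):
    if "subquery" not in q:
        return None
    return tuple("join" if op == "subquery" else op for op in q)

def push_down_filter(q):
    # Only applicable once a join exists, so rule order matters.
    if "join" not in q or "filter" not in q:
        return None
    return tuple("pushed_filter" if op == "filter" else op for op in q)

def drop_redundant_filter(q):
    if "redundant_filter" not in q:
        return None
    return tuple(op for op in q if op != "redundant_filter")

RULES = [unnest_subquery, push_down_filter, drop_redundant_filter]

def estimated_benefit(original, rewritten):
    # Assumption: stands in for the learned model; here it is simply
    # the relative reduction in toy cost.
    return (cost(original) - cost(rewritten)) / cost(original)

class Node:
    def __init__(self, query, parent=None):
        self.query = query
        self.parent = parent
        self.children = []
        self.untried = [r for r in RULES if r(query) is not None]
        self.visits = 0
        self.total_reward = 0.0

def uct(child, c):
    # Mean reward plus an exploration bonus for rarely visited nodes.
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(child.parent.visits) / child.visits)

def search(query, iterations=50, c=1.4):
    root = Node(query)
    best = query
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: uct(ch, c))
        # Expansion: apply one untried rule to create a child.
        if node.untried:
            rule = node.untried.pop(0)
            child = Node(rule(node.query), parent=node)
            node.children.append(child)
            node = child
        # Estimation: score the rewritten query.
        reward = estimated_benefit(query, node.query)
        if cost(node.query) < cost(best):
            best = node.query
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best

original = ("scan", "subquery", "filter", "redundant_filter")
best = search(original)
print(best, cost(original), "->", cost(best))
```

In this toy setting `push_down_filter` only becomes applicable after `unnest_subquery`, which mirrors the abstract's point that the order of applying rules affects the result.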

Original language	English
Pages (from-to)	46-58
Number of pages	13
Journal	Proceedings of the VLDB Endowment
Volume	15
Issue number	1
DOIs	https://doi.org/10.14778/3485450.3485456
Publication status	Published - 2021
Externally published	Yes
Event	48th International Conference on Very Large Data Bases, VLDB 2022, Sydney, Australia (5 Sept 2022 to 9 Sept 2022)

Cite this

Zhou, X., Li, G., Chai, C., & Feng, J. (2021). A Learned Query Rewrite System using Monte Carlo Tree Search. Proceedings of the VLDB Endowment, 15(1), 46-58. https://doi.org/10.14778/3485450.3485456

@article{6eed5014d6ad4482be33422d1c063f45,

title = "A Learned Query Rewrite System using Monte Carlo Tree Search",

abstract = "Query rewrite transforms a SQL query into an equivalent one but with higher performance. However, SQL rewrite is an NP-hard problem, and existing approaches adopt heuristics to rewrite the queries. These heuristics have two main limitations. First, the order of applying different rewrite rules significantly affects the query performance. However, the search space of all possible rewrite orders grows exponentially with the number of query operators and rules and it is rather hard to find the optimal rewrite order. Existing methods apply a pre-defined order to rewrite queries and will fall in a local optimum. Second, different rewrite rules have different benefits for different queries. Existing methods work on single plans but cannot effectively estimate the benefits of rewriting a query. To address these challenges, we propose a policy tree based query rewrite framework, where the root is the input query and each node is a rewritten query from its parent. We aim to explore the tree nodes in the policy tree to find the optimal rewrite query. We propose to use Monte Carlo Tree Search to explore the policy tree, which navigates the policy tree to efficiently get the optimal node. Moreover, we propose a learning-based model to estimate the expected performance improvement of each rewritten query, which guides the tree search more accurately. We also propose a parallel algorithm that can explore the tree search in parallel in order to improve the performance. Experimental results showed that our method significantly outperformed existing approaches.",

author = "Xuanhe Zhou and Guoliang Li and Chengliang Chai and Jianhua Feng",

note = "Publisher Copyright: {\textcopyright} 2021, VLDB Endowment. All rights reserved.; 48th International Conference on Very Large Data Bases, VLDB 2022 ; Conference date: 05-09-2022 Through 09-09-2022",

year = "2021",

doi = "10.14778/3485450.3485456",

language = "English",

volume = "15",

pages = "46--58",

journal = "Proceedings of the VLDB Endowment",

issn = "2150-8097",

publisher = "Very Large Data Base Endowment Inc.",

number = "1",

}

TY - JOUR

T1 - A Learned Query Rewrite System using Monte Carlo Tree Search

AU - Zhou, Xuanhe

AU - Li, Guoliang

AU - Chai, Chengliang

AU - Feng, Jianhua

PY - 2021

Y1 - 2021

AB - Query rewrite transforms a SQL query into an equivalent one but with higher performance. However, SQL rewrite is an NP-hard problem, and existing approaches adopt heuristics to rewrite the queries. These heuristics have two main limitations. First, the order of applying different rewrite rules significantly affects the query performance. However, the search space of all possible rewrite orders grows exponentially with the number of query operators and rules and it is rather hard to find the optimal rewrite order. Existing methods apply a pre-defined order to rewrite queries and will fall in a local optimum. Second, different rewrite rules have different benefits for different queries. Existing methods work on single plans but cannot effectively estimate the benefits of rewriting a query. To address these challenges, we propose a policy tree based query rewrite framework, where the root is the input query and each node is a rewritten query from its parent. We aim to explore the tree nodes in the policy tree to find the optimal rewrite query. We propose to use Monte Carlo Tree Search to explore the policy tree, which navigates the policy tree to efficiently get the optimal node. Moreover, we propose a learning-based model to estimate the expected performance improvement of each rewritten query, which guides the tree search more accurately. We also propose a parallel algorithm that can explore the tree search in parallel in order to improve the performance. Experimental results showed that our method significantly outperformed existing approaches.

UR - http://www.scopus.com/inward/record.url?scp=85126197266&partnerID=8YFLogxK

U2 - 10.14778/3485450.3485456

DO - 10.14778/3485450.3485456

M3 - Conference article

AN - SCOPUS:85126197266

SN - 2150-8097

VL - 15

SP - 46

EP - 58

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

IS - 1

T2 - 48th International Conference on Very Large Data Bases, VLDB 2022

Y2 - 5 September 2022 through 9 September 2022

ER -
