Reinforcement learning with tree-LSTM for join order selection

Xiang Yu; Guoliang Li; Chengliang Chai; Nan Tang

doi:10.1109/ICDE48307.2020.00116

Xiang Yu, Guoliang Li*, Chengliang Chai, Nan Tang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

121 Citations (Scopus)

Abstract

Join order selection (JOS), the problem of finding the optimal join order for an SQL query, is a primary focus of database query optimizers. The problem is hard because of its large solution space. Exhaustively traversing the solution space is prohibitively expensive, so exhaustive search is often combined with heuristic pruning. Despite decades of effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), which encode join trees as fixed-length, hand-tuned feature vectors, have shed some light on JOS. However, fixed-length feature vectors cannot capture the structural information of a join tree, which may lead to poor join plans. Moreover, they may require retraining the neural network to handle schema changes (e.g., adding tables/columns) or multi-alias table names, which are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves on existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it supports modifications of the database schema and multi-alias table names. Extensive experiments on the Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plans RTOS generates for JOB have, on average, 101% of the (estimated) cost and 67% of the latency (i.e., execution time) of plans produced by dynamic programming, which is known to produce state-of-the-art join plans.

Original language	English
Title of host publication	Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
Publisher	IEEE Computer Society
Pages	1297-1308
Number of pages	12
ISBN (Electronic)	9781728129037
DOIs	https://doi.org/10.1109/ICDE48307.2020.00116
Publication status	Published - Apr 2020
Externally published	Yes
Event	36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States Duration: 20 Apr 2020 → 24 Apr 2020

Publication series

Name	Proceedings - International Conference on Data Engineering
Volume	2020-April
ISSN (Print)	1084-4627

Conference

Conference	36th IEEE International Conference on Data Engineering, ICDE 2020
Country/Territory	United States
City	Dallas
Period	20/04/20 → 24/04/20

Access to Document

10.1109/ICDE48307.2020.00116

Cite this

Yu, X., Li, G., Chai, C., & Tang, N. (2020). Reinforcement learning with tree-LSTM for join order selection. In Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020 (pp. 1297-1308). Article 9101694 (Proceedings - International Conference on Data Engineering; Vol. 2020-April). IEEE Computer Society. https://doi.org/10.1109/ICDE48307.2020.00116

@inproceedings{0cc72462111643cb899bd93b47928726,

title = "Reinforcement learning with tree-LSTM for join order selection",

abstract = "Join order selection (JOS), the problem of finding the optimal join order for an SQL query, is a primary focus of database query optimizers. The problem is hard because of its large solution space. Exhaustively traversing the solution space is prohibitively expensive, so exhaustive search is often combined with heuristic pruning. Despite decades of effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), which encode join trees as fixed-length, hand-tuned feature vectors, have shed some light on JOS. However, fixed-length feature vectors cannot capture the structural information of a join tree, which may lead to poor join plans. Moreover, they may require retraining the neural network to handle schema changes (e.g., adding tables/columns) or multi-alias table names, which are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves on existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it supports modifications of the database schema and multi-alias table names. Extensive experiments on the Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plans RTOS generates for JOB have, on average, 101% of the (estimated) cost and 67% of the latency (i.e., execution time) of plans produced by dynamic programming, which is known to produce state-of-the-art join plans.",

author = "Xiang Yu and Guoliang Li and Chengliang Chai and Nan Tang",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 36th IEEE International Conference on Data Engineering, ICDE 2020 ; Conference date: 20-04-2020 Through 24-04-2020",

year = "2020",

month = apr,

doi = "10.1109/ICDE48307.2020.00116",

language = "English",

series = "Proceedings - International Conference on Data Engineering",

publisher = "IEEE Computer Society",

pages = "1297--1308",

booktitle = "Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020",

address = "United States",

}

Yu, X, Li, G, Chai, C & Tang, N 2020, Reinforcement learning with tree-LSTM for join order selection. in Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020, 9101694, Proceedings - International Conference on Data Engineering, vol. 2020-April, IEEE Computer Society, pp. 1297-1308, 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, United States, 20/04/20. https://doi.org/10.1109/ICDE48307.2020.00116

Reinforcement learning with tree-LSTM for join order selection. / Yu, Xiang; Li, Guoliang; Chai, Chengliang et al.
Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020. IEEE Computer Society, 2020. p. 1297-1308 9101694 (Proceedings - International Conference on Data Engineering; Vol. 2020-April).

TY - GEN

T1 - Reinforcement learning with tree-LSTM for join order selection

AU - Yu, Xiang

AU - Li, Guoliang

AU - Chai, Chengliang

AU - Tang, Nan

PY - 2020/4

Y1 - 2020/4

N2 - Join order selection (JOS), the problem of finding the optimal join order for an SQL query, is a primary focus of database query optimizers. The problem is hard because of its large solution space. Exhaustively traversing the solution space is prohibitively expensive, so exhaustive search is often combined with heuristic pruning. Despite decades of effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), which encode join trees as fixed-length, hand-tuned feature vectors, have shed some light on JOS. However, fixed-length feature vectors cannot capture the structural information of a join tree, which may lead to poor join plans. Moreover, they may require retraining the neural network to handle schema changes (e.g., adding tables/columns) or multi-alias table names, which are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves on existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it supports modifications of the database schema and multi-alias table names. Extensive experiments on the Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plans RTOS generates for JOB have, on average, 101% of the (estimated) cost and 67% of the latency (i.e., execution time) of plans produced by dynamic programming, which is known to produce state-of-the-art join plans.

UR - http://www.scopus.com/inward/record.url?scp=85083025070&partnerID=8YFLogxK

U2 - 10.1109/ICDE48307.2020.00116

DO - 10.1109/ICDE48307.2020.00116

M3 - Conference contribution

AN - SCOPUS:85083025070

T3 - Proceedings - International Conference on Data Engineering

SP - 1297

EP - 1308

BT - Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020

PB - IEEE Computer Society

T2 - 36th IEEE International Conference on Data Engineering, ICDE 2020

Y2 - 20 April 2020 through 24 April 2020

ER -
