Reinforcement learning with tree-LSTM for join order selection

Xiang Yu; Guoliang Li; Chengliang Chai; Nan Tang

doi:10.1109/ICDE48307.2020.00116

Xiang Yu, Guoliang Li^*, Chengliang Chai, Nan Tang

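^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

127 citations (Scopus)

Abstract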

Join order selection (JOS) - the problem of finding the optimal join order for an SQL query - is a primary focus of database query optimizers. The problem is hard due to its large solution space. Exhaustively traversing the solution space is prohibitively expensive, which is often combined with heuristic pruning. Despite decades-long effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), by encoding join trees with fixed-length hand-tuned feature vectors, have shed some light on JOS. However, using fixed-length feature vectors cannot capture the structural information of a join tree, which may produce poor join plans. Moreover, it may also cause retraining the neural network when handling schema changes (e.g., adding tables/columns) or multi-alias table names that are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it well supports the modification of database schema and multi-alias table names. Extensive experiments on Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plan RTOS generated for JOB is 101% on (estimated) cost and 67% on latency (i.e., execution time) on average, compared with dynamic programming that is known to produce the state-of-the-art results on join plans.

Original language	English
Title of host publication	Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
Publisher	IEEE Computer Society
Pages	1297-1308
Number of pages	12
ISBN (Electronic)	9781728129037
DOI	https://doi.org/10.1109/ICDE48307.2020.00116
Publication status	Published - Apr 2020
Externally published	Yes
Event	36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States. Duration: 20 Apr 2020 → 24 Apr 2020

Publication series

Name	Proceedings - International Conference on Data Engineering
Volume	2020-April
ISSN (Print)	1084-4627

Conference

Conference	36th IEEE International Conference on Data Engineering, ICDE 2020
Country/Territory	United States
City	Dallas
Period	20/04/20 → 24/04/20

Cite this

Yu, X., Li, G., Chai, C., & Tang, N. (2020). Reinforcement learning with tree-LSTM for join order selection. In Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020 (pp. 1297-1308). Article 9101694 (Proceedings - International Conference on Data Engineering; Vol. 2020-April). IEEE Computer Society. https://doi.org/10.1109/ICDE48307.2020.00116

@inproceedings{0cc72462111643cb899bd93b47928726,

title = "Reinforcement learning with tree-LSTM for join order selection",

abstract = "Join order selection (JOS) - the problem of finding the optimal join order for an SQL query - is a primary focus of database query optimizers. The problem is hard due to its large solution space. Exhaustively traversing the solution space is prohibitively expensive, which is often combined with heuristic pruning. Despite decades-long effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), by encoding join trees with fixed-length hand-tuned feature vectors, have shed some light on JOS. However, using fixed-length feature vectors cannot capture the structural information of a join tree, which may produce poor join plans. Moreover, it may also cause retraining the neural network when handling schema changes (e.g., adding tables/columns) or multi-alias table names that are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it well supports the modification of database schema and multi-alias table names. Extensive experiments on Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plan RTOS generated for JOB is 101% on (estimated) cost and 67% on latency (i.e., execution time) on average, compared with dynamic programming that is known to produce the state-of-the-art results on join plans.",

author = "Xiang Yu and Guoliang Li and Chengliang Chai and Nan Tang",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 36th IEEE International Conference on Data Engineering, ICDE 2020 ; Conference date: 20-04-2020 Through 24-04-2020",

year = "2020",

month = apr,

doi = "10.1109/ICDE48307.2020.00116",

language = "English",

series = "Proceedings - International Conference on Data Engineering",

publisher = "IEEE Computer Society",

pages = "1297--1308",

booktitle = "Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020",

address = "United States",

}

Yu, X, Li, G, Chai, C & Tang, N 2020, Reinforcement learning with tree-LSTM for join order selection. in Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020., 9101694, Proceedings - International Conference on Data Engineering, vol. 2020-April, IEEE Computer Society, pp. 1297-1308, 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, United States, 20/04/20. https://doi.org/10.1109/ICDE48307.2020.00116

Reinforcement learning with tree-LSTM for join order selection. / Yu, Xiang; Li, Guoliang; Chai, Chengliang et al.
Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020. IEEE Computer Society, 2020. p. 1297-1308 9101694 (Proceedings - International Conference on Data Engineering; Vol. 2020-April).

TY - GEN

T1 - Reinforcement learning with tree-LSTM for join order selection

AU - Yu, Xiang

AU - Li, Guoliang

AU - Chai, Chengliang

AU - Tang, Nan

PY - 2020/4

Y1 - 2020/4

AB - Join order selection (JOS) - the problem of finding the optimal join order for an SQL query - is a primary focus of database query optimizers. The problem is hard due to its large solution space. Exhaustively traversing the solution space is prohibitively expensive, which is often combined with heuristic pruning. Despite decades-long effort, traditional optimizers still suffer from low scalability or low accuracy when handling complicated SQL queries. Recent attempts using deep reinforcement learning (DRL), by encoding join trees with fixed-length hand-tuned feature vectors, have shed some light on JOS. However, using fixed-length feature vectors cannot capture the structural information of a join tree, which may produce poor join plans. Moreover, it may also cause retraining the neural network when handling schema changes (e.g., adding tables/columns) or multi-alias table names that are common in SQL queries. In this paper, we present RTOS, a novel learned optimizer that uses Reinforcement learning with Tree-structured long short-term memory (LSTM) for join Order Selection. RTOS improves existing DRL-based approaches in two main aspects: (1) it adopts graph neural networks to capture the structures of join trees; and (2) it well supports the modification of database schema and multi-alias table names. Extensive experiments on Join Order Benchmark (JOB) and TPC-H show that RTOS outperforms traditional optimizers and existing DRL-based learned optimizers. In particular, the plan RTOS generated for JOB is 101% on (estimated) cost and 67% on latency (i.e., execution time) on average, compared with dynamic programming that is known to produce the state-of-the-art results on join plans.

UR - http://www.scopus.com/inward/record.url?scp=85083025070&partnerID=8YFLogxK

U2 - 10.1109/ICDE48307.2020.00116

DO - 10.1109/ICDE48307.2020.00116

M3 - Conference contribution

AN - SCOPUS:85083025070

T3 - Proceedings - International Conference on Data Engineering

SP - 1297

EP - 1308

BT - Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020

PB - IEEE Computer Society

T2 - 36th IEEE International Conference on Data Engineering, ICDE 2020

Y2 - 20 April 2020 through 24 April 2020

ER -
