TY - GEN
T1 - FC
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
AU - Pan, Hexiang
AU - Ta, Quang Trung
AU - Zhang, Meihui
AU - Zhao, Zhanhao
AU - Chee, Yeow Meng
AU - Chen, Gang
AU - Ooi, Beng Chin
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Atomic commit protocols (ACPs) are crucial for ensuring transaction atomicity in distributed transaction processing. However, existing ACPs are designed for fixed failure conditions and cannot work efficiently in modern environments, where failures such as node crashes and connection delays can happen at any time due to the use of commodity nodes and networks. In this paper, we propose FC, a novel and practical ACP that can adapt to changes in failure conditions. In essence, FC includes three dedicated protocols, each designed for one of three failure conditions: (i) failure-free: no failure occurs; (ii) crash-failure: nodes might crash but no connection is delayed; and (iii) network-failure: both node crashes and delayed connections can occur. During its operation, FC monitors whether any failure occurs and dynamically switches to the most suitable protocol using a protocol selector whose parameters are fine-tuned by reinforcement learning. Consequently, FC improves transaction performance and robustly ensures fault tolerance when crash failures and network failures occur. We conduct extensive experiments to evaluate FC with both the YCSB and TPC-C benchmarks. The experimental results show that FC achieves up to 2.88x higher throughput and 3.76x lower latency than state-of-the-art ACPs, and that its performance is sustained when integrated with two popular databases, namely MongoDB and PostgreSQL.
KW - Atomic Commit
KW - Distributed Transactions
UR - https://www.scopus.com/pages/publications/85200508220
U2 - 10.1109/ICDE60146.2024.00162
DO - 10.1109/ICDE60146.2024.00162
M3 - Conference contribution
AN - SCOPUS:85200508220
T3 - Proceedings - International Conference on Data Engineering
SP - 2026
EP - 2039
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
Y2 - 13 May 2024 through 17 May 2024
ER -