Efficient State Management for Scaling Out Stateful Operators in Stream Processing Systems

Muhammad Mudassar; Yanlong Zhai; Lejian Liao

doi:10.1089/big.2018.0093

Muhammad Mudassar, Yanlong Zhai^*, Lejian Liao

^*Corresponding author for this work

School of Cyberspace Security

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

2 Citations (Scopus)

Abstract

Many big data applications require real-time analysis of continuous data streams. Stream Processing Systems (SPSs) are designed to act on real-time streaming data using continuous queries consisting of interconnected operators. The dynamic nature of data streams, for example, fluctuation in data arrival rates and uneven data distribution, can turn an operator into a bottleneck. Scalability is an important factor in SPSs, but correctly detecting a bottleneck operator and scaling it without affecting application execution are challenging. A stateful operator such as aggregation or join makes the scaling operation more difficult because it involves state management. Current research does not address the issue of scaling stateful operators efficiently, as most approaches stop the application to handle state, which results in significant performance overhead. In this article, the key idea is to detect the bottleneck operator correctly using a runtime bottleneck detection approach, and then to scale out this operator and manage its internal state in a way that achieves almost zero latency. For the bottleneck detection process, we define two parameters: alarming-threshold, which marks operators that may become bottlenecks in the future, and scale-out-threshold, which marks an operator that is a bottleneck. To scale out, we present two techniques, active backup and checkpointing. The former starts a Secondary Execution (SE) in the back end by partitioning state and input streams across multiple nodes at alarming-threshold; this SE replaces the primary node at scale-out-threshold. In the latter technique, a State Manager (SM) module starts checkpointing state to an external store at alarming-threshold and performs the scale out by managing state and input streams at scale-out-threshold. The first approach helps achieve the almost-zero-latency goal, while the latter is a resource-efficient technique. Our results show that both techniques work and meet the desired goals of reducing overall latency during scale out and improving resource utilization.
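
The two-threshold scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the load metric (a utilization value between 0 and 1), the threshold values 0.6 and 0.8, and the names `classify_load`, `route`, and `partition_state` are all assumptions made for illustration. Hash partitioning of keyed state is likewise an assumed strategy; the abstract does not say how state is partitioned.

```python
import zlib
from enum import Enum
from typing import Any, Dict, List


class Action(Enum):
    NONE = "none"            # operator is healthy
    PREPARE = "prepare"      # alarming-threshold reached: start backup or checkpointing
    SCALE_OUT = "scale_out"  # scale-out-threshold reached: switch to the scaled-out operator


def classify_load(load: float, alarming: float = 0.6, scale_out: float = 0.8) -> Action:
    """Map an operator's load reading to an action (threshold values are assumed)."""
    if not 0.0 <= alarming < scale_out:
        raise ValueError("expected 0 <= alarming < scale_out")
    if load >= scale_out:
        return Action.SCALE_OUT
    if load >= alarming:
        return Action.PREPARE
    return Action.NONE


def route(key: Any, num_nodes: int) -> int:
    """Pick a node for a key with a hash that is stable across processes."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def partition_state(state: Dict[Any, Any], num_nodes: int) -> List[Dict[Any, Any]]:
    """Split keyed operator state into one dictionary per target node."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    partitions: List[Dict[Any, Any]] = [{} for _ in range(num_nodes)]
    for key, value in state.items():
        partitions[route(key, num_nodes)][key] = value
    return partitions
```

Under these assumptions, a monitor would call `classify_load` on each reading, begin preparing state on `Action.PREPARE`, and apply `partition_state` together with the same `route` function to incoming tuples so that each key's state and its input stream end up on the same node.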

Original language	English
Pages (from-to)	192-206
Number of pages	15
Journal	Big Data
Volume	7
Issue number	3
DOI	https://doi.org/10.1089/big.2018.0093
Publication status	Published - Sep 2019

Cite this

@article{2b411f46b098482888d9d3cb5934292d,

title = "Efficient State Management for Scaling Out Stateful Operators in Stream Processing Systems",

keywords = "big data, scale out, state management, stream processing systems",

author = "Muhammad Mudassar and Yanlong Zhai and Lejian Liao",

year = "2019",

month = sep,

doi = "10.1089/big.2018.0093",

language = "English",

volume = "7",

pages = "192--206",

journal = "Big Data",

issn = "2167-6461",

publisher = "Mary Ann Liebert Inc.",

number = "3",

}

TY - JOUR

T1 - Efficient State Management for Scaling Out Stateful Operators in Stream Processing Systems

AU - Mudassar, Muhammad

AU - Zhai, Yanlong

AU - Liao, Lejian

PY - 2019/9

Y1 - 2019/9

KW - big data

KW - scale out

KW - state management

KW - stream processing systems

UR - http://www.scopus.com/inward/record.url?scp=85072266808&partnerID=8YFLogxK

U2 - 10.1089/big.2018.0093

DO - 10.1089/big.2018.0093

M3 - Article

C2 - 30994383

AN - SCOPUS:85072266808

SN - 2167-6461

VL - 7

SP - 192

EP - 206

JO - Big Data

JF - Big Data

IS - 3

ER -
