Rafiki: Machine learning as an analytics service system

Wei Wang; Jinyang Gao; Meihui Zhang; Sheng Wang; Gang Chen; Teck Khim Ng; Beng Chin Ooi; Jie Shao; Moaz Reyad

doi:10.14778/3282495.3282499

Research output: Contribution to journal › Conference article › peer-review

51 Citations (Scopus)

Abstract

Big data analytics has gained massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis of reviews for analyzing online products, image classification in food logging applications for monitoring users' daily intake, and stock movement prediction. Extending traditional database systems to support the above analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes a heavy burden on the system users. In this paper, we develop and present a system, called Rafiki, to provide the training and inference service of machine learning models. Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service, which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.

Original language	English
Pages (from-to)	128-140
Number of pages	13
Journal	Proceedings of the VLDB Endowment
Volume	12
Issue number	2
DOIs	https://doi.org/10.14778/3282495.3282499
Publication status	Published - 2018
Externally published	Yes
Event	45th International Conference on Very Large Data Bases, VLDB 2019 - Los Angeles, United States. Duration: 26 Aug 2019 → 30 Aug 2019

Cite this

@article{3c8b833f3ce14c2db33fde3a7a40026c,

title = "Rafiki: Machine learning as an analytics service system",

abstract = "Big data analytics is gaining massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially on high-stakes applications. Typical applications include sentiment analysis against reviews for analyzing on-line products, image classification in food logging applications for monitoring user's daily intake, and stock movement prediction. Extending traditional database systems to support the above analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes heavy burden on the system users. In this paper, we develop and present a system, called Rafiki, to provide the training and inference service of machine learning models. Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.",

author = "Wei Wang and Jinyang Gao and Meihui Zhang and Sheng Wang and Gang Chen and Ng, {Teck Khim} and Ooi, {Beng Chin} and Jie Shao and Moaz Reyad",

note = "Publisher Copyright: {\textcopyright} 2018 VLDB Endowment 21508097/18/07.; 45th International Conference on Very Large Data Bases, VLDB 2019 ; Conference date: 26-08-2019 Through 30-08-2019",

year = "2018",

doi = "10.14778/3282495.3282499",

language = "English",

volume = "12",

pages = "128--140",

journal = "Proceedings of the VLDB Endowment",

issn = "2150-8097",

publisher = "Very Large Data Base Endowment Inc.",

number = "2",

}

TY - JOUR

T1 - Rafiki: Machine learning as an analytics service system

T2 - 45th International Conference on Very Large Data Bases, VLDB 2019

AU - Wang, Wei

AU - Gao, Jinyang

AU - Zhang, Meihui

AU - Wang, Sheng

AU - Chen, Gang

AU - Ng, Teck Khim

AU - Ooi, Beng Chin

AU - Shao, Jie

AU - Reyad, Moaz

PY - 2018

Y1 - 2018

N2 - Big data analytics is gaining massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially on high-stakes applications. Typical applications include sentiment analysis against reviews for analyzing on-line products, image classification in food logging applications for monitoring user's daily intake, and stock movement prediction. Extending traditional database systems to support the above analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes heavy burden on the system users. In this paper, we develop and present a system, called Rafiki, to provide the training and inference service of machine learning models. Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.

AB - Big data analytics is gaining massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially on high-stakes applications. Typical applications include sentiment analysis against reviews for analyzing on-line products, image classification in food logging applications for monitoring user's daily intake, and stock movement prediction. Extending traditional database systems to support the above analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes heavy burden on the system users. In this paper, we develop and present a system, called Rafiki, to provide the training and inference service of machine learning models. Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.

UR - http://www.scopus.com/inward/record.url?scp=85061759520&partnerID=8YFLogxK

U2 - 10.14778/3282495.3282499

DO - 10.14778/3282495.3282499

M3 - Conference article

AN - SCOPUS:85061759520

SN - 2150-8097

VL - 12

SP - 128

EP - 140

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

IS - 2

Y2 - 26 August 2019 through 30 August 2019

ER -
