RDBMS Based Hadoop Metadata and Log Data Management Optimization

Haiying Che, Octave Iradukunda, Khalilov Shahin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Metadata is currently one of the fastest-growing sub-segments of enterprise data management. Even so, its growth cannot keep pace with the rapid increase in Big Data projects that organizations are initiating, a shortfall referred to as the 'Big Data Gap'. This paper introduces a novel approach that combines Apache Hadoop with a relational database to reduce query time and resource usage while increasing fault tolerance and efficiency. Hadoop's metadata and log files are migrated synchronously to PGSQL and managed through a graphical user interface (GUI). The experiment used a dataset of hundreds of thousands of movie ratings and reduced the resource usage of the NameNode by delegating log and metadata analysis to PGSQL. Queries in PGSQL ran 1.5 times faster than in Hadoop, and the data is stored in a structured format, unlike in Hadoop. Although the technique was implemented on a single node, it outperformed existing Hadoop deployments both on premises and in the cloud. The GUI, which uses charts and graphs, makes metadata and log data management easier. The results suggest that the proposed approach performs better than the existing solution and sharply decreases the hardware usage and budget of Big Data systems.

Original language: English
Title of host publication: ICSIDP 2019 - IEEE International Conference on Signal, Information and Data Processing 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728123455
DOI
Publication status: Published - Dec 2019
Event: 2019 IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2019 - Chongqing, China
Duration: 11 Dec 2019 to 13 Dec 2019

Publication series

Name: ICSIDP 2019 - IEEE International Conference on Signal, Information and Data Processing 2019

Conference

Conference: 2019 IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2019
Country/Territory: China
City: Chongqing
Period: 11/12/19 to 13/12/19
