ComMapReduce: An improvement of MapReduce with lightweight communication mechanisms

Linlin Ding*, Junchang Xin, Guoren Wang, Shan Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Citations (Scopus)

Abstract

As a parallel programming model, MapReduce runs scalable, parallel applications over huge amounts of data on large clusters. The MapReduce framework provides no communication mechanisms among Mappers, nor among Reducers. When the final result is much smaller than the original data, processing unpromising intermediate data objects wastes time. We observe that this waste can be avoided with simple communication mechanisms. In this paper, we propose ComMapReduce, a framework that extends and improves MapReduce for efficient query processing of massive data in the cloud. Using efficient, lightweight communication mechanisms, ComMapReduce filters unpromising intermediate data objects in the Map phase and thereby reduces the input of the Reduce phase. Three communication strategies, Lazy, Eager and Hybrid, are proposed to filter the unpromising intermediate results of the Map phase. In addition, two optimization strategies, Prepositive and Postpositive, are presented to enhance query processing performance by filtering more candidate data objects. Our extensive experiments on different synthetic datasets demonstrate that the ComMapReduce framework outperforms the original MapReduce framework in all metrics without affecting its existing characteristics.
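
The abstract does not spell out the communication protocol, so the following is only a minimal sketch of the general idea, written for a top-k query (the k largest values). Everything in it is an assumption made for illustration: the `Coordinator` class, the `exchange` method, and the choice of the k-th largest local value as the filter are hypothetical names and choices, not the paper's API, and the sketch does not claim to reproduce the Lazy, Eager or Hybrid strategy. Mappers run one after another in a single process here; a real deployment would run them in parallel on a cluster.

```python
import heapq


class Coordinator:
    """Hypothetical shared node holding the strictest filter value seen so far."""

    def __init__(self):
        self.global_filter = float("-inf")

    def exchange(self, local_filter):
        # Keep the largest threshold reported by any mapper and send it back.
        self.global_filter = max(self.global_filter, local_filter)
        return self.global_filter


def run_mapper(partition, k, coordinator):
    # Local top-k candidates of this partition, largest first.
    local_top = heapq.nlargest(k, partition)
    # If the partition holds at least k values, its k-th largest value is a
    # safe lower bound: the global top-k cannot contain anything smaller.
    local_filter = local_top[-1] if len(local_top) == k else float("-inf")
    threshold = coordinator.exchange(local_filter)
    # Drop candidates that can no longer appear in the final answer.
    return [v for v in local_top if v >= threshold]


def top_k_with_filter(partitions, k):
    coordinator = Coordinator()
    intermediate = []
    for partition in partitions:
        intermediate.extend(run_mapper(partition, k, coordinator))
    # Reduce step: merge the surviving candidates.
    return heapq.nlargest(k, intermediate), len(intermediate)


if __name__ == "__main__":
    parts = [[10, 20, 30, 40], [1, 5, 9, 3], [2, 4, 6, 8]]
    result, shipped = top_k_with_filter(parts, k=2)
    print(result, shipped)
```

In this toy run the first mapper sets the threshold to 30, so the other two mappers emit nothing and only two values reach the reduce step instead of six. How much is saved depends on the order in which mappers report, which is presumably one reason the paper distinguishes several communication strategies.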

Original language: English
Title of host publication: Database Systems for Advanced Applications - 17th International Conference, DASFAA 2012, Proceedings
Pages: 150-168
Number of pages: 19
Edition: PART 2
Publication status: Published - 2012
Externally published: Yes
Event: 17th International Conference on Database Systems for Advanced Applications, DASFAA 2012 - Busan, Korea, Republic of
Duration: 15 Apr 2012 to 18 Apr 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7239 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Database Systems for Advanced Applications, DASFAA 2012
Country/Territory: Korea, Republic of
City: Busan
Period: 15/04/12 to 18/04/12
