TY - GEN
T1 - Design and optimization of a big data computing framework based on CPU/GPU cluster
AU - Zhai, Yanlong
AU - Guo, Ying
AU - Chen, Qiurui
AU - Yang, Kai
AU - Mbarushimana, Emmanuel
PY - 2014
Y1 - 2014
N2 - Big data processing is receiving significant amount of interest as an important technology to reveal the information behind the data, such as trends, characteristics, etc. MapReduce is one of the most popular distributed parallel data processing framework. However, some high-end applications, especially some scientific analyses have both data-intensive and computation intensive features. Therefore, we have designed and implemented a high performance big data process framework called Lit, which leverages the power of Hadoop and GPUs. In this paper, we presented the basic design and architecture of Lit. More importantly, we spent a lot of effort on optimizing the communications between CPU and GPU. Lit integrated GPU with Hadoop to improve the computational power of each node in the cluster. To simplify the parallel programming, Lit provided an annotation based approach to automatically generate CUDA codes from Hadoop codes. Lit hid the complexity of programming on CPU/GPU cluster by providing extended compiler and optimizer. To utilize the simplified programming, scalability and fault tolerance benefits of Hadoop and combine them with the high performance computation power of GPU, Lit extended the Hadoop by applying a GPUClassloader to detect the GPU, generate and compile CUDA codes, and invoke the shared library. For all CPU-GPU co-processing systems, the communication with the GPU is the well-known performance bottleneck. We introduced data flow optimization approach to reduce unnecessary memory copies. Our experimental results show that Lit can achieve an average speedup of 1x to 3x on three typical applications over Hadoop, and the data flow optimization approach for the Lit can achieve about 16% performance gain.
AB - Big data processing is receiving significant amount of interest as an important technology to reveal the information behind the data, such as trends, characteristics, etc. MapReduce is one of the most popular distributed parallel data processing framework. However, some high-end applications, especially some scientific analyses have both data-intensive and computation intensive features. Therefore, we have designed and implemented a high performance big data process framework called Lit, which leverages the power of Hadoop and GPUs. In this paper, we presented the basic design and architecture of Lit. More importantly, we spent a lot of effort on optimizing the communications between CPU and GPU. Lit integrated GPU with Hadoop to improve the computational power of each node in the cluster. To simplify the parallel programming, Lit provided an annotation based approach to automatically generate CUDA codes from Hadoop codes. Lit hid the complexity of programming on CPU/GPU cluster by providing extended compiler and optimizer. To utilize the simplified programming, scalability and fault tolerance benefits of Hadoop and combine them with the high performance computation power of GPU, Lit extended the Hadoop by applying a GPUClassloader to detect the GPU, generate and compile CUDA codes, and invoke the shared library. For all CPU-GPU co-processing systems, the communication with the GPU is the well-known performance bottleneck. We introduced data flow optimization approach to reduce unnecessary memory copies. Our experimental results show that Lit can achieve an average speedup of 1x to 3x on three typical applications over Hadoop, and the data flow optimization approach for the Lit can achieve about 16% performance gain.
UR - http://www.scopus.com/inward/record.url?scp=84903946827&partnerID=8YFLogxK
U2 - 10.1109/HPCC.and.EUC.2013.147
DO - 10.1109/HPCC.and.EUC.2013.147
M3 - Conference contribution
AN - SCOPUS:84903946827
SN - 9780769550886
T3 - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
SP - 1039
EP - 1046
BT - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
PB - IEEE Computer Society
T2 - 15th IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 11th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2013
Y2 - 13 November 2013 through 15 November 2013
ER -