TY - GEN
T1 - Performance evaluation of data-push thread on commercial CMP platform
AU - Zhang, Jianxun
AU - Gu, Zhimin
AU - Zheng, Ninghan
AU - Huang, Yan
AU - Cai, Min
AU - Yang, Sicai
AU - Zhou, Wenbiao
PY - 2010
Y1 - 2010
N2 - Helper threads are a promising prefetching technique for bridging the memory wall on contemporary CMP platforms. However, synchronization between the application and the helper thread is critical to the performance improvement. Previous research has mainly focused on loop-count-based synchronization, which is suitable only when the main thread has enough computation workload. For the case where the main thread has a small computation workload, this paper presents a multi-parameter helper thread prefetching model. Using memory-intensive workloads, this paper gives a detailed performance evaluation of the data-push (helper) thread on a commercial CMP platform. We also evaluate the applicability of data-push thread prefetching in a multi-process environment. A methodology covering workload selection, measurement metrics, and the hardware prefetcher throttle effect is described. The evaluation results using data-push threads on em3d, mcf, and mst show gains of 12%, 24%, and 42%, respectively, when the hardware prefetcher was adjusted properly.
AB - Helper threads are a promising prefetching technique for bridging the memory wall on contemporary CMP platforms. However, synchronization between the application and the helper thread is critical to the performance improvement. Previous research has mainly focused on loop-count-based synchronization, which is suitable only when the main thread has enough computation workload. For the case where the main thread has a small computation workload, this paper presents a multi-parameter helper thread prefetching model. Using memory-intensive workloads, this paper gives a detailed performance evaluation of the data-push (helper) thread on a commercial CMP platform. We also evaluate the applicability of data-push thread prefetching in a multi-process environment. A methodology covering workload selection, measurement metrics, and the hardware prefetcher throttle effect is described. The evaluation results using data-push threads on em3d, mcf, and mst show gains of 12%, 24%, and 42%, respectively, when the hardware prefetcher was adjusted properly.
KW - Data prefetching
KW - Pre-execution
KW - Push/helper thread
UR - http://www.scopus.com/inward/record.url?scp=77954771135&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:77954771135
SN - 9788988678206
T3 - INC2010 - The International Conference on Networked Computing, Proceeding
SP - 21
EP - 26
BT - INC2010 - The International Conference on Networked Computing, Proceeding
T2 - 6th International Conference on Networked Computing, INC2010
Y2 - 11 May 2010 through 13 May 2010
ER -