A compilation and run-time framework for maximizing performance of self-scheduling algorithms

Yizhuo Wang, Laleh Aghababaie Beni, Alexandru Nicolau, Alexander V. Veidenbaum, Rosario Cammarota

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Ordinary programs contain many parallel loops, which account for a significant portion of their completion time. Executing such loops in parallel can significantly improve performance on modern multi-core systems. We propose a new framework, Locality Aware Self-scheduling (LASS), for scheduling parallel loops on multi-core systems and boosting the performance of known self-scheduling algorithms under diverse execution conditions. LASS enforces data locality by assigning consecutive chunks of iterations to the same core, and favours load balancing by introducing a work-stealing mechanism. LASS is evaluated on a set of kernels on a multi-core system with 16 cores. Two execution scenarios are considered. In the first scenario, our application runs alone on top of the operating system. In the second scenario, our application runs alongside an interfering parallel job. The average speedup achieved by LASS is 11% in the first scenario and 31% in the second.
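
The two ideas named in the abstract (consecutive chunks kept on one core, plus work stealing for load balance) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' LASS implementation: the abstract does not specify chunk sizing, the steal policy, or a cost model, so the fixed `chunk_size`, the "steal from the tail of the longest queue" rule, the per-iteration `core_cost` values, and all function names below are hypothetical.

```python
from collections import deque


def make_chunks(n_iters, chunk_size):
    """Split the iteration space [0, n_iters) into consecutive (start, end) chunks."""
    return [(s, min(s + chunk_size, n_iters)) for s in range(0, n_iters, chunk_size)]


def assign_contiguous(chunks, n_cores):
    """Give each core a contiguous run of chunks (the locality step)."""
    per_core, extra = divmod(len(chunks), n_cores)
    queues, pos = [], 0
    for core in range(n_cores):
        take = per_core + (1 if core < extra else 0)
        queues.append(deque(chunks[pos:pos + take]))
        pos += take
    return queues


def simulate(n_iters, chunk_size, core_cost):
    """Simulate one parallel loop.

    core_cost[c] is an assumed time per iteration on core c; a larger value
    models a core slowed down by an interfering job.
    Returns (log, makespan), where log[c] lists the chunks core c executed.
    """
    n_cores = len(core_cost)
    queues = assign_contiguous(make_chunks(n_iters, chunk_size), n_cores)
    clock = [0.0] * n_cores
    log = [[] for _ in range(n_cores)]
    active = set(range(n_cores))
    while active:
        # The core that becomes idle first asks for work next.
        core = min(active, key=lambda c: (clock[c], c))
        if queues[core]:
            # Own queue: take the next consecutive chunk.
            chunk = queues[core].popleft()
        else:
            # Own queue empty: steal from the tail of the longest queue.
            victim = max(range(n_cores), key=lambda c: len(queues[c]))
            if not queues[victim]:
                active.discard(core)  # no work left anywhere
                continue
            chunk = queues[victim].pop()
        start, end = chunk
        clock[core] += (end - start) * core_cost[core]
        log[core].append(chunk)
    return log, max(clock)


if __name__ == "__main__":
    log, makespan = simulate(n_iters=16, chunk_size=2, core_cost=[1.0, 3.0])
    for core, chunks in enumerate(log):
        print(core, chunks)
    print("makespan:", makespan)
```

With the assumed costs `[1.0, 3.0]`, the faster core finishes its own four chunks and then steals two chunks from the slower core, so the simulated loop finishes in 12 time units rather than the 24 a static split would need. Stealing from the tail is a common design choice because it leaves the victim's next consecutive chunks untouched.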

Original language: English
Title of host publication: Network and Parallel Computing - 11th IFIP WG 10.3 International Conference, NPC 2014, Proceedings
Publisher: Springer Verlag
Pages: 459-470
Number of pages: 12
ISBN (Print): 9783662449165
DOI: https://doi.org/10.1007/978-3-662-44917-2_38
Publication status: Published - 2014
Event: 11th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2014 - Ilan, Taiwan, China
Duration: 18 Sep 2014 to 20 Sep 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8707 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2014
Country/Territory: Taiwan, China
City: Ilan
Period: 18/09/14 to 20/09/14

Cite this

Wang, Y., Beni, L. A., Nicolau, A., Veidenbaum, A. V., & Cammarota, R. (2014). A compilation and run-time framework for maximizing performance of self-scheduling algorithms. In Network and Parallel Computing - 11th IFIP WG 10.3 International Conference, NPC 2014, Proceedings (pp. 459-470). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8707 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-662-44917-2_38