Abstract
First, the architectural characteristics of clusters of SMPs (CoSMPs) are examined on the basis of the parallel computing model HPM. Focusing on two aspects of the CoSMP architecture, parallelism and locality (storage and communication), the main factors that influence the performance of parallel applications are analyzed, and the problems of how to parallelize and optimize applications are discussed. The merits and demerits of two programming modes, the MPI mode and the MPI+SMP (OMP) directive mode, are examined. Then, several techniques for parallelizing and optimizing applications on CoSMPs are described in detail. Finally, the performance of two communication modes (the loop-exchange mode and the border-exchange mode) is evaluated on one instance of a cluster of SMPs, the Dawning 3000 supercomputer. The methods are tested with two examples, a matrix multiplication algorithm for the loop-exchange mode and a five-point algorithm for the border-exchange mode, and the results are consistent with the theoretical conclusions.
Original language | English
---|---
Pages (from-to) | 621-629 |
Number of pages | 9 |
Journal | Jisuanji Yanjiu yu Fazhan/Computer Research and Development |
Volume | 41 |
Issue number | 4 |
Publication status | Published - Apr 2004 |
Externally published | Yes |
Keywords
- CoSMPs
- HPM
- Memory-hierarchy
- SMP