dc.contributor.author |
Drosinos, N |
en |
dc.contributor.author |
Koziris, N |
en |
dc.date.accessioned |
2014-03-01T02:42:54Z |
|
dc.date.available |
2014-03-01T02:42:54Z |
|
dc.date.issued |
2004 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/31137 |
|
dc.subject |
Experimental Evaluation |
en |
dc.subject |
Hybrid Model |
en |
dc.subject |
Message Passing |
en |
dc.subject |
Nested Loops |
en |
dc.subject |
Parallel Models |
en |
dc.subject |
Performance Comparison |
en |
dc.subject |
Shared Memory |
en |
dc.subject |
SMP Cluster |
en |
dc.subject.other |
Bandwidth |
en |
dc.subject.other |
Codes (symbols) |
en |
dc.subject.other |
Computer architecture |
en |
dc.subject.other |
Data communication systems |
en |
dc.subject.other |
Mathematical models |
en |
dc.subject.other |
Parallel algorithms |
en |
dc.subject.other |
Synchronization |
en |
dc.subject.other |
Bandwidth measurements |
en |
dc.subject.other |
Flow data dependencies |
en |
dc.subject.other |
Intra-node messages |
en |
dc.subject.other |
Parallelization |
en |
dc.subject.other |
Parallel processing systems |
en |
dc.title |
Performance comparison of pure MPI vs hybrid MPI-OpenMP parallelization models on SMP clusters |
en |
heal.type |
conferenceItem |
en |
heal.identifier.primary |
10.1109/IPDPS.2004.1302919 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1109/IPDPS.2004.1302919 |
en |
heal.publicationDate |
2004 |
en |
heal.abstract |
This paper compares the performance of three programming paradigms for the parallelization of nested loop algorithms onto SMP clusters. More specifically, we propose three alternative models for tiled nested loop algorithms, namely a pure message passing paradigm, as well as two hybrid ones that implement communication both through message passing and shared memory access. The hybrid models adopt an advanced hyperplane scheduling scheme that allows both for minimal thread synchronization and for pipelined execution with overlapping of computation and communication phases. We focus on the experimental evaluation of all three models, and test their performance against several iteration spaces and parallelization grains with the aid of a typical micro-kernel benchmark. We conclude that the hybrid models can in some cases be more beneficial compared to the monolithic pure message passing model, as they better exploit the configuration characteristics of a hierarchical parallel platform, such as an SMP cluster. |
en |
heal.journalName |
Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM) |
en |
dc.identifier.doi |
10.1109/IPDPS.2004.1302919 |
en |
dc.identifier.volume |
18 |
en |
dc.identifier.spage |
193 |
en |
dc.identifier.epage |
202 |
en |