dc.contributor.author |
Goumas, G |
en |
dc.contributor.author |
Drosinos, N |
en |
dc.contributor.author |
Athanasaki, M |
en |
dc.contributor.author |
Koziris, N |
en |
dc.date.accessioned |
2014-03-01T02:42:30Z |
|
dc.date.available |
2014-03-01T02:42:30Z |
|
dc.date.issued |
2004 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/31025 |
|
dc.subject |
Automatic SPMD code generation |
en |
dc.subject |
MPI |
en |
dc.subject |
Nested loops |
en |
dc.subject |
Parallelizing compilers |
en |
dc.subject |
Supernodes |
en |
dc.subject |
Tiling |
en |
dc.subject.other |
Automatic SPMD code generation |
en |
dc.subject.other |
MPI |
en |
dc.subject.other |
Nested loops |
en |
dc.subject.other |
Parallelizing compilers |
en |
dc.subject.other |
Supernode |
en |
dc.subject.other |
Tiling |
en |
dc.subject.other |
Algorithms |
en |
dc.subject.other |
Codes (symbols) |
en |
dc.subject.other |
Computer programming |
en |
dc.subject.other |
Mathematical transformations |
en |
dc.subject.other |
Problem solving |
en |
dc.subject.other |
Program compilers |
en |
dc.subject.other |
Scheduling |
en |
dc.subject.other |
Software engineering |
en |
dc.subject.other |
Computer science |
en |
dc.title |
Automatic parallel code generation for tiled nested loops |
en |
heal.type |
conferenceItem |
en |
heal.identifier.primary |
10.1145/967900.968184 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1145/967900.968184 |
en |
heal.publicationDate |
2004 |
en |
heal.abstract |
This paper presents an overview of our work on a complete end-to-end framework for automatically generating message-passing parallel code for tiled nested for-loops. The framework handles general parallelepiped tiling transformations and general convex iteration spaces. We address all problems regarding both the generation of sequential tiled code and its parallelization. We have implemented our techniques in a tool that automatically generates MPI parallel code, and we conducted several series of experiments on the compilation time of the tool, the efficiency of the generated code, and the speedup attained on a cluster of PCs. Apart from confirming the value of our techniques, our experimental results show the merit of general parallelepiped tiling transformations and verify previous theoretical work on scheduling-optimal tile shapes. |
en |
heal.journalName |
Proceedings of the ACM Symposium on Applied Computing |
en |
dc.identifier.doi |
10.1145/967900.968184 |
en |
dc.identifier.volume |
2 |
en |
dc.identifier.spage |
1412 |
en |
dc.identifier.epage |
1419 |
en |