Automatic code generation for executing tiled nested loops onto parallel architectures

Goumas, G; Athanasaki, M; Koziris, N

dc.contributor.author	Goumas, G	en
dc.contributor.author	Athanasaki, M	en
dc.contributor.author	Koziris, N	en
dc.date.accessioned	2014-03-01T02:42:04Z
dc.date.available	2014-03-01T02:42:04Z
dc.date.issued	2002	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/30763
dc.subject	Code generation	en
dc.subject	Fourier-Motzkin elimination	en
dc.subject	Hermite normal forms	en
dc.subject	Loop tiling	en
dc.subject	Non-unimodular transformations	en
dc.subject.other	Computational complexity	en
dc.subject.other	Data storage equipment	en
dc.subject.other	Hierarchical systems	en
dc.subject.other	Iterative methods	en
dc.subject.other	Mathematical transformations	en
dc.subject.other	Parallel processing systems	en
dc.subject.other	Automatic code generation	en
dc.subject.other	Fourier-Motzkin elimination	en
dc.subject.other	Hermite normal forms	en
dc.subject.other	Memory access architectures	en
dc.subject.other	Multilevel memory	en
dc.subject.other	Distributed computer systems	en
dc.title	Automatic code generation for executing tiled nested loops onto parallel architectures	en
heal.type	conferenceItem	en
heal.identifier.primary	10.1145/508791.508961	en
heal.identifier.secondary	http://dx.doi.org/10.1145/508791.508961	en
heal.publicationDate	2002	en
heal.abstract	This paper presents a novel approach to the problem of generating tiled code for nested for-loops using a tiling transformation. Tiling, or supernode transformation, has been widely used to improve locality in multi-level memory hierarchies as well as to efficiently execute loops on non-uniform memory access architectures. However, automatic code generation for tiled loops can be a very complex compiler task due to non-rectangular tile shapes and iteration space bounds. Our method considerably enhances previous work on rewriting tiled loops by considering parallelepiped tiles and arbitrary iteration space shapes. The complexity of code generation for a tiling transformation is now reduced to the complexity of code generation for any linear transformation. Experimental results comparing all approaches presented so far show that the proposed approach generates tiled code significantly faster.	en
heal.journalName	Proceedings of the ACM Symposium on Applied Computing	en
dc.identifier.doi	10.1145/508791.508961	en
dc.identifier.spage	876	en
dc.identifier.epage	881	en