Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems

dc.contributor.author Riakiotakis, I en
dc.contributor.author Ciorba, FM en
dc.contributor.author Andronikos, T en
dc.contributor.author Papakonstantinou, G en
dc.contributor.author Chronopoulos, AT en
dc.date.accessioned 2014-03-01T11:47:19Z
dc.date.available 2014-03-01T11:47:19Z
dc.date.issued 2012 en
dc.identifier.issn 15320626 en
dc.identifier.uri http://hdl.handle.net/123456789/38126
dc.subject Communication model en
dc.subject Dynamic load balancing en
dc.subject Heterogeneous systems en
dc.subject Inter-processor communication en
dc.subject Loops with data dependencies en
dc.subject Performance evaluation en
dc.subject Performance prediction en
dc.subject Pipelined computations en
dc.subject Synchronization en
dc.title Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems en
heal.type other en
heal.identifier.primary 10.1002/cpe.2812 en
heal.identifier.secondary http://dx.doi.org/10.1002/cpe.2812 en
heal.publicationDate 2012 en
heal.abstract Loops are the richest source of parallelism in scientific applications. A large number of loop scheduling schemes have therefore been devised for loops with and without data dependencies (modeled as dependence distance vectors) on heterogeneous clusters. The loops with data dependencies require synchronization via cross-node communication. Synchronization requires fine-tuning to overcome the communication overhead and to yield the best possible overall performance. In this paper, a theoretical model is presented to determine the granularity of synchronization that minimizes the parallel execution time of loops with data dependencies when these are parallelized on heterogeneous systems using dynamic self-scheduling algorithms. New formulas are proposed for estimating the total number of scheduling steps when a threshold for the minimum work assigned to a processor is assumed. The proposed model uses these formulas to determine the synchronization granularity that minimizes the estimated parallel execution time. The accuracy of the proposed model is verified and validated via extensive experiments on a heterogeneous computing system. The results show that the theoretically optimal synchronization granularity, as determined by the proposed model, is very close to the experimentally observed optimal synchronization granularity, with no deviation in the best case, and within 38.4% in the worst case. © 2012 John Wiley & Sons, Ltd. en
heal.journalName Concurrency and Computation: Practice and Experience en
dc.identifier.doi 10.1002/cpe.2812 en
