HEAL DSpace

Exploring the performance limits of simultaneous multithreading for memory intensive applications

dc.contributor.author Athanasaki, E en
dc.contributor.author Anastopoulos, N en
dc.contributor.author Kourtis, K en
dc.contributor.author Koziris, N en
dc.date.accessioned 2014-03-01T01:28:23Z
dc.date.available 2014-03-01T01:28:23Z
dc.date.issued 2008 en
dc.identifier.issn 0920-8542 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/18832
dc.subject Instruction-level parallelism en
dc.subject Performance analysis en
dc.subject Simultaneous multithreading en
dc.subject Software prefetching en
dc.subject Speculative precomputation en
dc.subject Thread-level parallelism en
dc.subject.classification Computer Science, Hardware & Architecture en
dc.subject.classification Computer Science, Theory & Methods en
dc.subject.classification Engineering, Electrical & Electronic en
dc.subject.other Data storage equipment en
dc.subject.other Parallel programming en
dc.subject.other Resource allocation en
dc.subject.other Synchronization en
dc.subject.other Instruction-level parallelism (ILP) en
dc.subject.other Simultaneous multithreading (SMT) en
dc.subject.other Speculative precomputation en
dc.subject.other Thread-level parallelism en
dc.subject.other Multiprocessing systems en
dc.title Exploring the performance limits of simultaneous multithreading for memory intensive applications en
heal.type journalArticle en
heal.identifier.primary 10.1007/s11227-007-0149-x en
heal.identifier.secondary http://dx.doi.org/10.1007/s11227-007-0149-x en
heal.language English en
heal.publicationDate 2008 en
heal.abstract Simultaneous multithreading (SMT) has been proposed to improve system throughput by overlapping instructions from multiple threads on a single wide-issue processor. Recent studies have demonstrated that diversity among simultaneously executed applications can yield significant performance gains under SMT. However, the speedup of a single application that is parallelized into multiple threads is often sensitive to its inherent instruction-level parallelism (ILP), as well as to the efficiency of the synchronization and communication mechanisms between its separate, but possibly dependent, threads. Moreover, as these separate threads tend to put pressure on the same architectural resources, no significant speedup can be observed. In this paper, we evaluate and contrast thread-level parallelism (TLP) and speculative precomputation (SPR) techniques for a series of memory-intensive codes executed on a specific SMT processor implementation. We explore the performance limits by evaluating the tradeoffs between ILP and TLP for various kinds of instruction streams. By obtaining knowledge of how such streams interact when executed simultaneously on the processor, and by quantifying their presence within each application's threads, we try to interpret the observed performance of each application when parallelized according to the aforementioned techniques. To support this evaluation, we also present results gathered from the performance monitoring hardware of the processor. © 2007 Springer Science+Business Media, LLC. en
heal.publisher SPRINGER en
heal.journalName Journal of Supercomputing en
dc.identifier.doi 10.1007/s11227-007-0149-x en
dc.identifier.isi ISI:000253523500004 en
dc.identifier.volume 44 en
dc.identifier.issue 1 en
dc.identifier.spage 64 en
dc.identifier.epage 97 en


Files in this item

There are no files associated with this item.
