HEAL DSpace

Exploring the performance limits of simultaneous multithreading for scientific codes

dc.contributor.author Athanasaki, E en
dc.contributor.author Anastopoulos, N en
dc.contributor.author Kourtis, K en
dc.contributor.author Koziris, N en
dc.date.accessioned 2014-03-01T02:44:03Z
dc.date.available 2014-03-01T02:44:03Z
dc.date.issued 2006 en
dc.identifier.issn 01903918 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/31639
dc.subject Instruction Level Parallelism en
dc.subject Perforation en
dc.subject simultaneous multithreading en
dc.subject Thread Level Parallelism en
dc.subject.other Performance monitoring hardware en
dc.subject.other Simultaneous multithreading (SMT) en
dc.subject.other Thread-level parallelism (TLP) techniques en
dc.subject.other Interfaces (computer) en
dc.subject.other Natural sciences computing en
dc.subject.other Program processors en
dc.subject.other Synchronization en
dc.subject.other Throughput en
dc.subject.other Parallel processing systems en
dc.title Exploring the performance limits of simultaneous multithreading for scientific codes en
heal.type conferenceItem en
heal.identifier.primary 10.1109/ICPP.2006.41 en
heal.identifier.secondary http://dx.doi.org/10.1109/ICPP.2006.41 en
heal.identifier.secondary 1690604 en
heal.publicationDate 2006 en
heal.abstract Simultaneous multithreading (SMT) has been proposed to improve system throughput by overlapping instructions from multiple threads on a single wide-issue processor. The speedup of a single application that is parallelized into multiple threads is often sensitive to its inherent instruction-level parallelism (ILP), as well as to the efficiency of the synchronization and communication mechanisms between its separate, but possibly dependent, threads. In this paper, we evaluate and contrast software prefetching and thread-level parallelism (TLP) techniques for a series of scientific codes executed on an SMT processor. We explore the performance limits by evaluating the tradeoffs between ILP and TLP for various kinds of instruction streams. By obtaining knowledge of how such streams interact when executed simultaneously on the processor, and by quantifying their presence within each application's threads, we try to interpret the observed performance of each application when parallelized according to the aforementioned techniques. To strengthen this evaluation, we also present results gathered from the performance monitoring hardware of the processor. © 2006 IEEE. en
heal.journalName Proceedings of the International Conference on Parallel Processing en
dc.identifier.doi 10.1109/ICPP.2006.41 en
dc.identifier.spage 45 en
dc.identifier.epage 54 en


Files in this item

There are no files associated with this item.