HEAL DSpace

A predecoding technique for ILP exploitation in Java processors

dc.contributor.author Sideris, I en
dc.contributor.author Pekmestzi, K en
dc.contributor.author Economakos, G en
dc.date.accessioned 2014-03-01T01:27:47Z
dc.date.available 2014-03-01T01:27:47Z
dc.date.issued 2008 en
dc.identifier.issn 1383-7621 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/18572
dc.subject ILP en
dc.subject Java processor en
dc.subject Predecoded cache en
dc.subject Stack folding en
dc.subject.classification Computer Science, Hardware & Architecture en
dc.subject.other Computer programming languages en
dc.subject.other Decoding en
dc.subject.other Mathematical transformations en
dc.subject.other Program processors en
dc.subject.other Reduced instruction set computing en
dc.subject.other Throughput en
dc.subject.other Elsevier (CO) en
dc.subject.other Execution performance en
dc.subject.other hardware accelerations en
dc.subject.other In order en
dc.subject.other Instruction-level parallelism (ILP) en
dc.subject.other JAVA applications en
dc.subject.other Java bytecodes en
dc.subject.other Java processors en
dc.subject.other JAVA programs en
dc.subject.other Java Virtual Machine (JVM) en
dc.subject.other Out of order en
dc.subject.other Out-of-order execution en
dc.subject.other Predecoded cache en
dc.subject.other Stack folding algorithms en
dc.subject.other super scalar en
dc.subject.other Java programming language en
dc.title A predecoding technique for ILP exploitation in Java processors en
heal.type journalArticle en
heal.identifier.primary 10.1016/j.sysarc.2008.01.008 en
heal.identifier.secondary http://dx.doi.org/10.1016/j.sysarc.2008.01.008 en
heal.language English en
heal.publicationDate 2008 en
heal.abstract Java processors have been introduced to offer hardware acceleration for Java applications. They execute Java bytecodes directly in hardware. However, the stack-based nature of the Java virtual machine instruction set limits the achievable execution performance. To exploit instruction-level parallelism and allow out-of-order execution, we must remove the stack completely. This can be achieved by recursive stack folding algorithms, such as OPEX, which dynamically transform groups of Java bytecodes into RISC-like instructions. However, the decoding throughput obtained this way is limited. In this paper, we explore microarchitectural techniques to improve the decoding throughput of Java processors. Our techniques are based on the use of a predecoded cache that stores the folding results so that they can be reused. The ultimate goal is to exploit all available instruction-level parallelism in Java programs by feeding a superscalar out-of-order core in the backend at a sustainable rate. With a predecoded cache of 2 x 2048 entries and a 4-way superscalar core, we achieve from 4.8 to 18.3 times better performance than an architecture employing pattern-based folding. (c) 2008 Elsevier B.V. All rights reserved. en
heal.publisher ELSEVIER SCIENCE BV en
heal.journalName Journal of Systems Architecture en
dc.identifier.doi 10.1016/j.sysarc.2008.01.008 en
dc.identifier.isi ISI:000259160500007 en
dc.identifier.volume 54 en
dc.identifier.issue 7 en
dc.identifier.spage 707 en
dc.identifier.epage 728 en
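
The abstract describes two ideas: folding groups of stack bytecodes into RISC-like register instructions, and keeping the folded result in a predecoded cache so that it can be reused. The sketch below illustrates both in software, under loud assumptions. It is not the OPEX algorithm and not the paper's microarchitecture. The class `FoldingSketch`, the five-opcode mini instruction set, the `v`/`t` register naming, and the address-keyed `HashMap` standing in for the predecoded cache are all hypothetical, chosen only to make the idea concrete.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FoldingSketch {
    // Hypothetical mini bytecode: each entry is {opcode, operand}.
    // The operand is a local-variable index and is ignored by arithmetic opcodes.
    static final int ILOAD = 0, ISTORE = 1, IADD = 2, ISUB = 3, IMUL = 4;

    // Hypothetical stand-in for a predecoded cache:
    // start address of a bytecode group -> folded instructions.
    static final Map<Integer, List<String>> predecoded = new HashMap<>();
    static int folds = 0; // how many times the folding logic actually ran

    // Folds a straight-line group by simulating the operand stack symbolically:
    // the stack holds register names instead of values.
    static List<String> fold(int[][] group) {
        folds++;
        List<String[]> out = new ArrayList<>(); // {op, dst, src1, src2}
        Deque<String> stack = new ArrayDeque<>();
        int nextTemp = 0;
        for (int[] insn : group) {
            switch (insn[0]) {
                case ILOAD:
                    stack.push("v" + insn[1]); // a load becomes a source operand
                    break;
                case ISTORE: {
                    String src = stack.pop();
                    String dst = "v" + insn[1];
                    // Simplification: a pending load of dst would read the new value.
                    if (stack.contains(dst)) {
                        throw new IllegalStateException("hazard on " + dst + " not handled");
                    }
                    String[] last = out.isEmpty() ? null : out.get(out.size() - 1);
                    if (last != null && src.startsWith("t") && last[1].equals(src)) {
                        last[1] = dst; // fold the store into the producing instruction
                    } else {
                        out.add(new String[] {"mov", dst, src, ""});
                    }
                    break;
                }
                case IADD:
                case ISUB:
                case IMUL: {
                    String rhs = stack.pop();
                    String lhs = stack.pop();
                    String dst = "t" + nextTemp++;
                    String op = insn[0] == IADD ? "add" : insn[0] == ISUB ? "sub" : "mul";
                    out.add(new String[] {op, dst, lhs, rhs});
                    stack.push(dst);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode " + insn[0]);
            }
        }
        List<String> text = new ArrayList<>();
        for (String[] i : out) {
            String s = i[0] + " " + i[1] + ", " + i[2];
            text.add(i[3].isEmpty() ? s : s + ", " + i[3]);
        }
        return text;
    }

    // Returns the cached folding result if present; folds and stores it otherwise.
    static List<String> decode(int address, int[][] group) {
        List<String> hit = predecoded.get(address);
        if (hit != null) {
            return hit;
        }
        List<String> folded = fold(group);
        predecoded.put(address, folded);
        return folded;
    }

    public static void main(String[] args) {
        // v3 = (v1 + v2) * v4, written as stack code
        int[][] group = {
            {ILOAD, 1}, {ILOAD, 2}, {IADD, 0}, {ILOAD, 4}, {IMUL, 0}, {ISTORE, 3}
        };
        System.out.println(String.join("; ", decode(100, group)));
        decode(100, group); // second lookup is served from the cache
        System.out.println("folds = " + folds);
    }
}
```

In this sketch the six stack bytecodes collapse into two register instructions with no remaining stack references, and the second `decode` call for the same address skips folding entirely. A hardware predecoded cache would differ in indexing, capacity, and replacement, none of which is modelled here.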


Files in this item

There are no files associated with this item.
