Abstract

In this dissertation, we describe the DELFT-JAVA engine, a 32-bit RISC-based architecture that provides high-performance JAVA program execution. More specifically, we describe a microarchitecture that accelerates JAVA execution and provide details of the DELFT-JAVA architecture for executing JAVA Virtual Machine bytecode. The basic architecture implements a Media Processor with Signal Processing capabilities. The perspective of the approach is that, to maximally accelerate a compiled application, the machine language should accurately reflect the type of operations the compiler specifies. Except where JAVA Virtual Machine operations are unusually complex, we prefer to allow the compiler to optimize directly to the implementation. This is independent of any particular machine organization. The architecture is then a superset of the JAVA Virtual Machine and provides operations that are necessary for system execution (e.g., I/O, supervision, etc.). Rather than just supporting the JAVA Virtual Machine, the architecture takes a more general-purpose approach in that it is also intended to be programmed from a number of additional high-level languages, including C and C++. Furthermore, we introduce the concept of JAVA dynamic instruction translation, a new approach to JAVA hardware acceleration. In hardware-assisted dynamic translation, JAVA Virtual Machine instructions are translated on-the-fly into the DELFT-JAVA instruction set. The hardware requirements to perform this translation are not excessive. Consequently, support for JAVA language constructs is also incorporated into the processor's Instruction Set Architecture. This technique allows application-level parallelism inherent in the JAVA language to be efficiently utilized as instruction-level parallelism. In addition to dynamic translation, a special Link Translation Buffer (LTB) can be used to improve the performance of dynamic linking.
In addition, we deem it appropriate to provide architectural support for several key organizational structures, including: a) synchronization for multithreaded organizations, b) garbage collection, c) array bounds checking, d) real-time caches, e) multiple machines which can time-share the same datapath (e.g., the JAVA Virtual Machine and Media Processing functions), and f) vector/DSP operations. By building several models of the DELFT-JAVA engine, we were able to characterize performance metrics of kernels executing on our processor. We found that, when compared to realizable stack-based machines, our techniques could improve performance by 2.7x. Furthermore, by converting stack-based dependencies into pipeline dependencies, we showed that out-of-order superscalar machines could remove up to 60% of the hazards.
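
The general idea behind translating stack-based bytecode into register-style instructions can be illustrated with a small software sketch. It is not the DELFT-JAVA translation hardware or its instruction set: the function name `translate`, the register names `s0, s1, ...` (one per operand-stack slot) and `l0, l1, ...` (one per local variable), and the output tuple format are all assumptions made for illustration. The sketch tracks the operand-stack depth at translation time so that each implicit stack operand becomes an explicit register name, which exposes the data dependencies to a pipeline.

```python
def translate(bytecodes):
    """Rewrite a straight-line sequence of stack ops as register-style ops.

    Hypothetical mapping (not the DELFT-JAVA ISA): operand-stack slot i
    becomes register "s<i>", local variable n becomes register "l<n>".
    """
    depth = 0  # symbolic operand-stack depth, known at translation time
    out = []
    for op, *args in bytecodes:
        if op == "iload":
            # push local -> copy local register into the next stack register
            out.append(("mov", f"s{depth}", f"l{args[0]}"))
            depth += 1
        elif op == "istore":
            # pop into local -> copy top stack register into local register
            depth -= 1
            out.append(("mov", f"l{args[0]}", f"s{depth}"))
        elif op in ("iadd", "isub", "imul"):
            # pop two operands, push one result: dest, src1, src2 are explicit
            depth -= 2
            out.append((op[1:], f"s{depth}", f"s{depth}", f"s{depth + 1}"))
            depth += 1
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return out


# Bytecode for the statement: local3 = local1 + local2
program = [("iload", 1), ("iload", 2), ("iadd",), ("istore", 3)]
for instr in translate(program):
    print(instr)
```

In this toy form, the four stack operations become four register operations whose sources and destinations are named explicitly; a real translator would also have to handle control flow, wider types, and the remaining opcodes.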

