Abstract

Dynamic binary translation (DBT) is a core technology that enables the migration of legacy software to different instruction set architectures while preserving the original semantics. However, developing and maintaining an efficient cross-DBT system is challenging. Key challenges include memory access overhead, inefficient instruction simulation, and frequent context switches. In this paper, we propose three novel optimization techniques. First, we formalize a register mapping cost model and investigate a hierarchical register mapping approach to reduce the memory access overhead. Second, we accelerate floating point (FP) emulation by surrounding the use of the hardware FP unit with high-efficiency non-FP code. Third, we present a function inlining approach to alleviate the overhead associated with indirect control lookup. On the system side, we implement our approach on the ARM64 and SW64 architectures based on QEMU and extensively evaluate its effectiveness with the SPEC2006 benchmark suite. The experimental results show that, on SW64, our approach achieves an average performance speedup of 1.28× and an average code size reduction of 13.41%. Similarly, on ARM64, it achieves an average performance speedup of 1.15× and an average code size reduction of 11.48%.

