Abstract

This talk showcases ongoing studies on the emulation of the Armv8-A Scalable Vector Extension (SVE) on Armv8-A architectures. SVE is the newest SIMD instruction set for Armv8-A. It targets HPC workloads and features scalable vector lengths that enable vector-length-agnostic programming, gather/scatter memory accesses, and per-lane predication, among other features. Thanks to the Arm Instruction Emulator (ArmIE), we can run SVE workloads on real hardware which, paired with dynamic binary instrumentation (DynamoRIO), enables full application tracing without the overhead found in simulators. Traces can then be analysed or fed to simulators (e.g. a cache simulator) to extract more meaningful metrics such as percentage of vectorization, vector utilization, cache hit/miss rates, cache access latency, and more. In this talk, we dive into the current workflow of SVE dynamic binary instrumentation, from the toolset to the ongoing studies and preliminary results, providing some early findings and insights along the way. The toolset encompasses ArmIE and DynamoRIO with SVE instruction and memory tracing clients, as well as a simple SVE cache simulator that enables memory studies on the generated SVE memory traces. For the evaluation, the study considers several HPC mini- and proxy-applications, focusing on SVE utilization and the impact of SVE on the memory system for different vector lengths (128 bits up to 2048 bits).
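To make the trace-derived metrics above more concrete, the following is a minimal sketch of how percentage of vectorization, vector utilization, and cache hit/miss counts might be computed from a trace. The `TraceRecord` layout, the function names, and the direct-mapped cache model are all hypothetical and chosen for illustration; they are not the output format of the ArmIE or DynamoRIO clients, nor the cache simulator used in the talk. Vector utilization is assumed here to mean active predicate lanes divided by available lanes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TraceRecord:
    """Hypothetical trace entry; not the real ArmIE/DynamoRIO format."""
    is_sve: bool             # True if the instruction is an SVE instruction
    active_lanes: int        # lanes enabled by the governing predicate
    total_lanes: int         # lanes available at the chosen vector length
    address: Optional[int]   # byte address for memory accesses, else None


def vectorization_percentage(trace: List[TraceRecord]) -> float:
    """Share of executed instructions that are SVE instructions, in percent."""
    if not trace:
        return 0.0
    sve_count = sum(1 for record in trace if record.is_sve)
    return 100.0 * sve_count / len(trace)


def vector_utilization(trace: List[TraceRecord]) -> float:
    """Assumed definition: active lanes / available lanes over SVE records."""
    active = sum(r.active_lanes for r in trace if r.is_sve)
    total = sum(r.total_lanes for r in trace if r.is_sve)
    return active / total if total else 0.0


def simulate_direct_mapped(addresses: Iterable[int],
                           num_lines: int = 4,
                           line_bytes: int = 64) -> Tuple[int, int]:
    """Count hits and misses for a toy direct-mapped cache."""
    tags: List[Optional[int]] = [None] * num_lines
    hits = misses = 0
    for address in addresses:
        line = address // line_bytes   # cache-line number of this access
        index = line % num_lines       # slot the line maps to
        tag = line // num_lines        # identifies which line occupies the slot
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill or evict on a miss
    return hits, misses


example_trace = [
    TraceRecord(True, 4, 4, 0),       # fully predicated SVE load
    TraceRecord(True, 2, 4, 64),      # half of the lanes active
    TraceRecord(False, 0, 0, None),   # scalar, non-memory instruction
    TraceRecord(False, 0, 0, 0),      # scalar load
]
memory_addresses = [r.address for r in example_trace if r.address is not None]
```

In this toy trace, two of the four instructions are SVE and six of the eight available lanes are active. A real study would sweep the vector length (128 to 2048 bits) and use a more realistic cache hierarchy than this single direct-mapped level.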

