Abstract

Major architectural innovations in the compute node have been introduced in the IBM Blue Gene®/Q, including programmable Level 1 (L1) cache data prefetching units to hide memory access latency, hardware support for transactional memory (TM) and speculative execution (SE), an enhanced five-dimensional integrated torus network, and a high-performance quad floating-point SIMD (single-instruction, multiple-data) unit. In this paper, we present the tools and methodology that we used to model, co-design, and validate these new features from early concept phase through design implementation. Early in the design cycle, we made extensive use of an architectural simulator, BGQSim, capable of executing unmodified binary Blue Gene/Q code for single as well as multiple nodes. As the hardware description language for the chip implementation became available, we complemented BGQSim with a cycle-accurate and cycle-reproducible, large-scale field-programmable gate array-based platform, Twinstar, to validate the implementation against performance targets and functional specifications. Through specific examples, we show the effectiveness of these tools in co-developing the hardware and software of Blue Gene/Q, allowing us to meet the design targets at an aggressive project schedule.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.